This is a set of evolving thoughts and discoveries about NoSQL – when to use it – when not to…
At a high level , the table shows the NoSQL analagous concepts to familiar SQL terms.
|Relational Model (SQL)||NoSQL model ( e.g. Mongo DB )|
|row||document or BSON document|
|table joins||embedded documents and linking|
Specify any unique column or column combination as primary key.
In Mongo DB, the primary key is automatically set to the _id field.
|aggregation (e.g. group by)||aggregation framework
See the SQL to Aggregation Framework Mapping Chart.
Overview – Mainly around why we DESIGNED RELATIONAL databases in the first place
Relational databases helped eliminate data duplication (and errors related to duplication of data – for e.g. – having to update the SAME data in multiple places). As it happens, NoSQL in fact does not overcome this shortcoming – and in fact, if your app queries consist of lots of updates, NoSQL is not as efficient a solution as relational (and will be as error prone as flat files were prior to relational dbs).
However, if your app queries are primarily READS – then going through a set of RELATIONSHIPS to read a single record of user data – is a lot of overhead. And NoSQL wins hands down – by storing all the relevant data in ONE record.
E.g. – Author Information and Book Information – In SQL you would separate these into different tables – and provide a RELATIONSHIP between the two – so that any updates to Author Info – only needs to happen in a single place – and will automatically be reflected in any BOOK query and AUTHOR query as well.
In No SQL, if the Author’s name changes, then the info will need to be updated in the AuthorInfo and the BookInfo NoSQL records (since they both contain the author name). Multiple places to update data – more error prone (there are a few patterns to work around this – but the basic issue remains).
In NoSQL , you can define relationships (it is hard to do anything meaningful with data without defining relationships). Relationships are of two types – GUID based – and Parent Child based. Say – you want to associate UserComments with a User. You would define Comments as a child of User – and it would be a Parent-Child relationship. Now, you could run queries (searches is more appropriate, since QUERY applies more to the relational model) against this parent child COLLECTION of objects.
Both models rely on indices.
Summary – When to Use Relational , When to Use NoSQL – Tradeoff
The important thing to understand, is how to perform efficient queries and to understand the referential integrity tradeoff that is made when we de-normalize to achieve high performance applications. Always consider the application query patterns when designing the data model.