Introduction to MongoDB NoSQL database for SQL Developers – Part 2
January 30th, 2014 / No Comments » / by admin
In the first instalment to this series I discussed some core aspects of MongoDB NoSQL database installation and configuration. In the next three posts (this one and two others – post 3 HERE and post 4 HERE) I would like to focus on using Mongo’s Shell to outline some basic syntax intricacies regarding working with documents and collections as well as querying the collections’ data.
Before diving into Mongo’s syntax a word or two about data storage aspects. The following concepts best reflect Mongo’s approach to how data is stored and the equivalent constructs in relational database world.
- A document is a basic unit of data for MongoDB and is roughly equivalent to a row in a relational table.
- Similarly, a collection can be thought of as a table with a dynamic schema.
- A single instance of MongoDB can host multiple independent databases, each of which can have its own collections.
- Every document has a special key, ‘_id’, that is unique within a collection.
- MongoDB supports a rich, ad-hoc query language of its own but does not support SQL or SQL-style expressions.
- MongoDB stores data is a JSON-like objects comprising the data model, rather than RDBMS tables. Significantly, MongoDB supports neither joins nor transactions. However, it features secondary indexes, an expressive query language, atomic writes on a per-document level, and fully-consistent reads.
You will have to admit that this is quite a paradigm shift for someone heavily entrenched in what Edgar Codd defined 30 or so years ago and what has effectively became the standard for the majority of data storage engines. But given that MongoDB is one of the fastest growing NoSQL databases in the world by a variety of metrics, it goes to show that significant technology investment by companies such as SAP, EBay, Craigslist, GitHub, New York Times and many, many more would not be without its merit.
Let’s get to the nitty-gritty now and go through some examples outlining Mongo’s data storage construct and syntax used to access it. MongoDB, as mentioned previously, stores data in a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.
Documents, on the other hand, are stored in collections. A collection is a group of related documents that have a set of shared common indexes. Collections are analogous to a table in relational databases.
Basic CRUD operations in MongoDB (excluding RETRIEVE) modify the data of a single collection. For the update and delete operations, you can specify the criteria to select the documents to update or remove. In the following diagram, the insert operation adds a new document to the users collection.
Let’s go through some specific scenarios where data in the designated collection is modified via one of the WRITE operations. There are three classes of write operations in MongoDB: INSERT, UPDATE and REMOVE. Insert operations add new data to a collection. Update operations modify existing data, and remove operations delete data from a collection. No insert, update, or remove can affect more than one document atomically.
Create operations add new documents to a collection. In MongoDB, the db.collection.insert() method performs create operations. In addition, both the db.collection.update() method and the db.collection.save() method can also add new documents through an operation called an upsert. An upsert is an operation that performs either an update of an existing document or an insert of a new document if the document to modify does not exist. The following diagram highlights the components of a MongoDB insert operation.
If you add a new document without the _id field, the client library or the Mongo instance adds an _id field and populates the field with a unique ObjectId. If you specify the _id field, the value must be unique within the collection, otherwise Mongo returns a duplicate key exception.
The following diagram shows the same query in SQL.
Update operations modify existing documents in a collection. In MongoDB, db.collection.update() and the db.collection.save() methods perform update operations. The db.collection.update() method can accept query criteria to determine which documents to update as well as an option to update multiple rows. The method can also accept options that affect its behavior such as the ‘multi’ option to update multiple documents. The following diagram highlights the components of a MongoDB update operation.
The following diagram shows the same query in SQL.
Delete operations remove documents from a collection. In MongoDB, db.collection.remove() method performs delete operations. The db.collection.remove() method can accept query criteria to determine which documents to remove. The following diagram highlights the components of a MongoDB remove operation.
The following diagram shows the same query in SQL.
This is a basic overview of the core CRUD syntax constructs that MongoDB uses to interact with the database collections and documents. Next, let put this knowledge into practice and run a few statements on the actual database itself. Below is a sample code showcasing a few CRUD commands in Mongo’s ‘foo’ database (with exception of a few lines of SQL thrown in for comparison).
--Switch to database named 'foo': use foo; --Insert three documents into 'purchase_hist' collection: db.purchase_hist.insert([{ "_id" : ObjectId("52ce3648a4a0bb15a8e07b75"), "CustomerID" : 23424, "First_Name" : "Mary", "Middle_Name" : "Joeanne", "Last_Name" : "Black", "Email_Address" : "mary0120@yahoo.com.au", "Phone_Number" : "03-8573-9455", "Purchased_Units" : 1, "Purchased_Value" : 29.99, "Item_SKU" : "RMK973", "Rec_Created_Date" : ISODate("2011-12-31T13:00:00Z") }, { "_id" : ObjectId("52ce3648a4a0bb15a8e07b76"), "CustomerID" : 84933, "First_Name" : "John", "Middle_Name" : "Lorance", "Last_Name" : "Moore", "Email_Address" : "johnnym@awol.com", "Phone_Number" : "03-3423-1155", "Purchased_Units" : 1, "Purchased_Value" : 49.0, "Item_SKU" : "MKK833", "Rec_Created_Date" : ISODate("2011-12-31T13:00:00Z") }, { "_id" : ObjectId("52ce3648a4a0bb15a8e07b77"), "CustomerID" : 19583, "First_Name" : "Martin", "Middle_Name" : null, "Last_Name" : "Laser", "Email_Address" : "mlaser91@aol.com", "Phone_Number" : "03-2355-1109", "Purchased_Units" : 5, "Purchased_Value" : 2099.49, "Item_SKU" : "HHY009", "Rec_Created_Date" : ISODate("2012-03-31T13:00:00Z") }]); --Show inserted documents from 'purchase_hist' collection: db.purchase_hist.find(); --Show values of the document to be updated from 'purchase_hist' collection: db.purchase_hist.find({ "Middle_Name": "Joeanne" }); --Update document with ID '52ce3648a4a0bb15a8e07b75' with a new 'Phone_Number' value: db.purchase_hist.update( {Middle_Name: "Joeanne"}, {$set: {Phone_Number: "03-9393-6100"}}, {multi: true}); --Which is equivalent to the following SQL in RDBMS: UPDATE purchase_hist SET Phone_Number = '03-9393-6100' WHERE Middle_Name = 'Joeanne' --Show updated document with the new value: db.purchase_hist.find({ "Middle_Name": "Joeanne" }); --Show document with Item SKU 'HHY009' value: db.purchase_hist.find({ "Item_SKU": "HHY009" }); --Remove a document with Item SKU 'HHY009' value: db.purchase_hist.remove( {Item_SKU: "HHY009"}); --Which is equivalent to the following SQL in RDBMS: DELETE FROM purchase_hist WHERE Item_SKU = 'HHY009' --Show that document with Item SKU 'HHY009' value has been removed: db.purchase_hist.find({ Item_SKU: "HHY009" });
Running this code should present you with the following output in a Windows command line pane.
In the NEXT post I will go over some key commands and their variables used for data extraction.