Introduction to MongoDB NoSQL database for SQL Developers – Part 3

January 30th, 2014 / No Comments » / by admin

In the last post I outlined some basic MongoDB CRUD commands and their relationship/mapping to Structured Query Language. In this post I will go over some key commands and their variables used for data extraction.

Read operations, or queries, retrieve data stored in the database. In MongoDB, queries select documents from a single collection. Queries specify criteria, or conditions, that identify the documents that MongoDB returns to the clients. A query may include a projection that specifies the fields from the matching documents to return. The projection limits the amount of data that MongoDB returns to the client over the network.

For query operations, MongoDB provide a db.collection.find() method. The method accepts both the query criteria and projections and returns a cursor to the matching documents. You can optionally modify the query to impose limits, skips, and sort orders. The following diagram highlights the components of a MongoDB query operation.

MongoDB-part3-crud-annotated-findThe next diagram shows the same query in SQL.

MongoDB-part3-crud-annotated-SQL-select

MongoDB queries exhibit the following behavior:

  • All queries in MongoDB address a single collection.
  • You can modify the query to impose limits, skips, and sort orders.
  • The order of documents returned by a query is not defined and is not defined unless you specify a sort().
  • Operations that modify existing documents (i.e. updates) use the same query syntax as queries to select documents to update.
  • In aggregation pipeline, the $match pipeline stage provides access to MongoDB queries.

MongoDB provides a db.collection.findOne() method as a special case of find() that returns a single document. Consider the following diagram of the query process that specifies a query criteria and a sort modifier.

MongoDB-part3-crud-query-stages

In the diagram, the query selects documents from the users collection. Using a query selection operator to define the conditions for matching documents, the query selects documents that have age greater than (i.e. $gt) 18. Then the sort() modifier sorts the results by age in ascending order.

Queries in MongoDB return all fields in all matching documents by default. To limit the amount of data that MongoDB sends to applications, include a projection in the queries. By projecting results with a subset of fields, applications reduce their network overhead and processing requirements. Projections, which are the the second argument to the find() method, may either specify a list of fields to return or list fields to exclude in the result documents. Consider the following diagram of the query process that specifies a query criteria and a projection.

MongoDB-part3-crud-query-w-projection-stages

In the above diagram, the query selects from the users collection. The criteria matches the documents that have age equal to 18. Then the projection specifies that only the name field should return in the matching documents.

Naturally, MongoDB also supports other query constructs which are used to achieve further functionality for data extraction and retrieval e.g. grouping, case statements, ordering etc. Below is a graphical depiction of the sample SQL statements and their corresponding equivalent in the syntax used by MongoDB.

MongoDB-part3-MySQL-to-MongoDB-mapping

In the next post I will dive deeper into MongoDB’s query syntax and introduce a couple of handy tools which can be used for basic administrative and data retrieval tasks.

Tags: , ,

Introduction to MongoDB NoSQL database for SQL Developers – Part 2

January 30th, 2014 / No Comments » / by admin

In the first instalment to this series I discussed some core aspects of MongoDB NoSQL database installation and configuration. In the next three posts (this one and two others – post 3 HERE and post 4 HERE) I would like to focus on using Mongo’s Shell to outline some basic syntax intricacies regarding working with documents and collections as well as querying the collections’ data.

Before diving into Mongo’s syntax a word or two about data storage aspects. The following concepts best reflect Mongo’s approach to how data is stored and the equivalent constructs in relational database world.

  • A document is a basic unit of data for MongoDB and is roughly equivalent to a row in a relational table.
  • Similarly, a collection can be thought of as a table with a dynamic schema.
  • A single instance of MongoDB can host multiple independent databases, each of which can have its own collections.
  • Every document has a special key, ‘_id’, that is unique within a collection.
  • MongoDB supports a rich, ad-hoc query language of its own but does not support SQL or SQL-style expressions.
  • MongoDB stores data is a JSON-like objects comprising the data model, rather than RDBMS tables. Significantly, MongoDB supports neither joins nor transactions. However, it features secondary indexes, an expressive query language, atomic writes on a per-document level, and fully-consistent reads.

You will have to admit that this is quite a paradigm shift for someone heavily entrenched in what Edgar Codd defined 30 or so years ago and what has effectively became the standard for the majority of data storage engines. But given that MongoDB is one of the fastest growing NoSQL databases in the world by a variety of metrics, it goes to show that significant technology investment by companies such as SAP, EBay, Craigslist, GitHub, New York Times and many, many more would not be without its merit.

Let’s get to the nitty-gritty now and go through some examples outlining Mongo’s data storage construct and syntax used to access it. MongoDB, as mentioned previously, stores data in a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.

MongoDB-part2-crud-annotated-document

Documents, on the other hand, are stored in collections. A collection is a group of related documents that have a set of shared common indexes. Collections are analogous to a table in relational databases.

MongoDB-part2-crud-annotated-collection

Basic CRUD operations in MongoDB (excluding RETRIEVE) modify the data of a single collection. For the update and delete operations, you can specify the criteria to select the documents to update or remove. In the following diagram, the insert operation adds a new document to the users collection.

MongoDB-part2-crud-insert-stages

Let’s go through some specific scenarios where data in the designated collection is modified via one of the WRITE operations. There are three classes of write operations in MongoDB: INSERT, UPDATE and REMOVE. Insert operations add new data to a collection. Update operations modify existing data, and remove operations delete data from a collection. No insert, update, or remove can affect more than one document atomically.

Create operations add new documents to a collection. In MongoDB, the db.collection.insert() method performs create operations. In addition, both the db.collection.update() method and the db.collection.save() method can also add new documents through an operation called an upsert. An upsert is an operation that performs either an update of an existing document or an insert of a new document if the document to modify does not exist. The following diagram highlights the components of a MongoDB insert operation.

MongoDB-part2-MG-insert

If you add a new document without the _id field, the client library or the Mongo instance adds an _id field and populates the field with a unique ObjectId. If you specify the _id field, the value must be unique within the collection, otherwise Mongo returns a duplicate key exception.

The following diagram shows the same query in SQL.

MongoDB-part2-SQL-insert

Update operations modify existing documents in a collection. In MongoDB, db.collection.update() and the db.collection.save() methods perform update operations. The db.collection.update() method can accept query criteria to determine which documents to update as well as an option to update multiple rows. The method can also accept options that affect its behavior such as the ‘multi’ option to update multiple documents. The following diagram highlights the components of a MongoDB update operation.

MongoDB-part2-MG-updateThe following diagram shows the same query in SQL.

MongoDB-part2-SQL-updateDelete operations remove documents from a collection. In MongoDB, db.collection.remove() method performs delete operations. The db.collection.remove() method can accept query criteria to determine which documents to remove. The following diagram highlights the components of a MongoDB remove operation.

MongoDB-part2-MG-deleteThe following diagram shows the same query in SQL.

MongoDB-part2-SQL-delete

This is a basic overview of the core CRUD syntax constructs that MongoDB uses to interact with the database collections and documents. Next, let put this knowledge into practice and run a few statements on the actual database itself. Below is a sample code showcasing a few CRUD commands in Mongo’s ‘foo’ database (with exception of a few lines of SQL thrown in for comparison).

--Switch to database named 'foo':
use foo;

--Insert three documents into 'purchase_hist' collection:
db.purchase_hist.insert([{
  "_id" : ObjectId("52ce3648a4a0bb15a8e07b75"),
  "CustomerID" : 23424,
  "First_Name" : "Mary",
  "Middle_Name" : "Joeanne",
  "Last_Name" : "Black",
  "Email_Address" : "mary0120@yahoo.com.au",
  "Phone_Number" : "03-8573-9455",
  "Purchased_Units" : 1,
  "Purchased_Value" : 29.99,
  "Item_SKU" : "RMK973",
  "Rec_Created_Date" : ISODate("2011-12-31T13:00:00Z")
},
{
  "_id" : ObjectId("52ce3648a4a0bb15a8e07b76"),
  "CustomerID" : 84933,
  "First_Name" : "John",
  "Middle_Name" : "Lorance",
  "Last_Name" : "Moore",
  "Email_Address" : "johnnym@awol.com",
  "Phone_Number" : "03-3423-1155",
  "Purchased_Units" : 1,
  "Purchased_Value" : 49.0,
  "Item_SKU" : "MKK833",
  "Rec_Created_Date" : ISODate("2011-12-31T13:00:00Z")
},
{
  "_id" : ObjectId("52ce3648a4a0bb15a8e07b77"),
  "CustomerID" : 19583,
  "First_Name" : "Martin",
  "Middle_Name" : null,
  "Last_Name" : "Laser",
  "Email_Address" : "mlaser91@aol.com",
  "Phone_Number" : "03-2355-1109",
  "Purchased_Units" : 5,
  "Purchased_Value" : 2099.49,
  "Item_SKU" : "HHY009",
  "Rec_Created_Date" : ISODate("2012-03-31T13:00:00Z")
}]);

--Show inserted documents from 'purchase_hist' collection:
db.purchase_hist.find();

--Show values of the document to be updated from 'purchase_hist' collection:
db.purchase_hist.find({
    "Middle_Name": "Joeanne"
});

--Update document with ID '52ce3648a4a0bb15a8e07b75' with a new 'Phone_Number' value:
db.purchase_hist.update(
	{Middle_Name: "Joeanne"},
	{$set: {Phone_Number: "03-9393-6100"}},
	{multi: true});

--Which is equivalent to the following SQL in RDBMS:
UPDATE purchase_hist
SET Phone_Number = '03-9393-6100'
WHERE Middle_Name = 'Joeanne'

--Show updated document with the new value:
db.purchase_hist.find({
    "Middle_Name": "Joeanne"
});

--Show document with Item SKU 'HHY009' value:
db.purchase_hist.find({
    "Item_SKU": "HHY009"
});

--Remove a document with Item SKU 'HHY009' value:
db.purchase_hist.remove(
	{Item_SKU: "HHY009"});

--Which is equivalent to the following SQL in RDBMS:
DELETE FROM purchase_hist
WHERE Item_SKU = 'HHY009'

--Show that document with Item SKU 'HHY009' value has been removed:
db.purchase_hist.find({
	Item_SKU: "HHY009"
});

Running this code should present you with the following output in a Windows command line pane.

MongoDB-part2-crud-foodb-execoutput-annoted

In the NEXT post I will go over some key commands and their variables used for data extraction.

Tags: , ,