Introduction to MongoDB NoSQL database for SQL Developers – Part 2

January 30th, 2014 / No Comments » / by admin

In the first instalment to this series I discussed some core aspects of MongoDB NoSQL database installation and configuration. In the next three posts (this one and two others – post 3 HERE and post 4 HERE) I would like to focus on using Mongo’s Shell to outline some basic syntax intricacies regarding working with documents and collections as well as querying the collections’ data.

Before diving into Mongo’s syntax a word or two about data storage aspects. The following concepts best reflect Mongo’s approach to how data is stored and the equivalent constructs in relational database world.

  • A document is a basic unit of data for MongoDB and is roughly equivalent to a row in a relational table.
  • Similarly, a collection can be thought of as a table with a dynamic schema.
  • A single instance of MongoDB can host multiple independent databases, each of which can have its own collections.
  • Every document has a special key, ‘_id’, that is unique within a collection.
  • MongoDB supports a rich, ad-hoc query language of its own but does not support SQL or SQL-style expressions.
  • MongoDB stores data is a JSON-like objects comprising the data model, rather than RDBMS tables. Significantly, MongoDB supports neither joins nor transactions. However, it features secondary indexes, an expressive query language, atomic writes on a per-document level, and fully-consistent reads.

You will have to admit that this is quite a paradigm shift for someone heavily entrenched in what Edgar Codd defined 30 or so years ago and what has effectively became the standard for the majority of data storage engines. But given that MongoDB is one of the fastest growing NoSQL databases in the world by a variety of metrics, it goes to show that significant technology investment by companies such as SAP, EBay, Craigslist, GitHub, New York Times and many, many more would not be without its merit.

Let’s get to the nitty-gritty now and go through some examples outlining Mongo’s data storage construct and syntax used to access it. MongoDB, as mentioned previously, stores data in a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.

MongoDB-part2-crud-annotated-document

Documents, on the other hand, are stored in collections. A collection is a group of related documents that have a set of shared common indexes. Collections are analogous to a table in relational databases.

MongoDB-part2-crud-annotated-collection

Basic CRUD operations in MongoDB (excluding RETRIEVE) modify the data of a single collection. For the update and delete operations, you can specify the criteria to select the documents to update or remove. In the following diagram, the insert operation adds a new document to the users collection.

MongoDB-part2-crud-insert-stages

Let’s go through some specific scenarios where data in the designated collection is modified via one of the WRITE operations. There are three classes of write operations in MongoDB: INSERT, UPDATE and REMOVE. Insert operations add new data to a collection. Update operations modify existing data, and remove operations delete data from a collection. No insert, update, or remove can affect more than one document atomically.

Create operations add new documents to a collection. In MongoDB, the db.collection.insert() method performs create operations. In addition, both the db.collection.update() method and the db.collection.save() method can also add new documents through an operation called an upsert. An upsert is an operation that performs either an update of an existing document or an insert of a new document if the document to modify does not exist. The following diagram highlights the components of a MongoDB insert operation.

MongoDB-part2-MG-insert

If you add a new document without the _id field, the client library or the Mongo instance adds an _id field and populates the field with a unique ObjectId. If you specify the _id field, the value must be unique within the collection, otherwise Mongo returns a duplicate key exception.

The following diagram shows the same query in SQL.

MongoDB-part2-SQL-insert

Update operations modify existing documents in a collection. In MongoDB, db.collection.update() and the db.collection.save() methods perform update operations. The db.collection.update() method can accept query criteria to determine which documents to update as well as an option to update multiple rows. The method can also accept options that affect its behavior such as the ‘multi’ option to update multiple documents. The following diagram highlights the components of a MongoDB update operation.

MongoDB-part2-MG-updateThe following diagram shows the same query in SQL.

MongoDB-part2-SQL-updateDelete operations remove documents from a collection. In MongoDB, db.collection.remove() method performs delete operations. The db.collection.remove() method can accept query criteria to determine which documents to remove. The following diagram highlights the components of a MongoDB remove operation.

MongoDB-part2-MG-deleteThe following diagram shows the same query in SQL.

MongoDB-part2-SQL-delete

This is a basic overview of the core CRUD syntax constructs that MongoDB uses to interact with the database collections and documents. Next, let put this knowledge into practice and run a few statements on the actual database itself. Below is a sample code showcasing a few CRUD commands in Mongo’s ‘foo’ database (with exception of a few lines of SQL thrown in for comparison).

--Switch to database named 'foo':
use foo;

--Insert three documents into 'purchase_hist' collection:
db.purchase_hist.insert([{
  "_id" : ObjectId("52ce3648a4a0bb15a8e07b75"),
  "CustomerID" : 23424,
  "First_Name" : "Mary",
  "Middle_Name" : "Joeanne",
  "Last_Name" : "Black",
  "Email_Address" : "mary0120@yahoo.com.au",
  "Phone_Number" : "03-8573-9455",
  "Purchased_Units" : 1,
  "Purchased_Value" : 29.99,
  "Item_SKU" : "RMK973",
  "Rec_Created_Date" : ISODate("2011-12-31T13:00:00Z")
},
{
  "_id" : ObjectId("52ce3648a4a0bb15a8e07b76"),
  "CustomerID" : 84933,
  "First_Name" : "John",
  "Middle_Name" : "Lorance",
  "Last_Name" : "Moore",
  "Email_Address" : "johnnym@awol.com",
  "Phone_Number" : "03-3423-1155",
  "Purchased_Units" : 1,
  "Purchased_Value" : 49.0,
  "Item_SKU" : "MKK833",
  "Rec_Created_Date" : ISODate("2011-12-31T13:00:00Z")
},
{
  "_id" : ObjectId("52ce3648a4a0bb15a8e07b77"),
  "CustomerID" : 19583,
  "First_Name" : "Martin",
  "Middle_Name" : null,
  "Last_Name" : "Laser",
  "Email_Address" : "mlaser91@aol.com",
  "Phone_Number" : "03-2355-1109",
  "Purchased_Units" : 5,
  "Purchased_Value" : 2099.49,
  "Item_SKU" : "HHY009",
  "Rec_Created_Date" : ISODate("2012-03-31T13:00:00Z")
}]);

--Show inserted documents from 'purchase_hist' collection:
db.purchase_hist.find();

--Show values of the document to be updated from 'purchase_hist' collection:
db.purchase_hist.find({
    "Middle_Name": "Joeanne"
});

--Update document with ID '52ce3648a4a0bb15a8e07b75' with a new 'Phone_Number' value:
db.purchase_hist.update(
	{Middle_Name: "Joeanne"},
	{$set: {Phone_Number: "03-9393-6100"}},
	{multi: true});

--Which is equivalent to the following SQL in RDBMS:
UPDATE purchase_hist
SET Phone_Number = '03-9393-6100'
WHERE Middle_Name = 'Joeanne'

--Show updated document with the new value:
db.purchase_hist.find({
    "Middle_Name": "Joeanne"
});

--Show document with Item SKU 'HHY009' value:
db.purchase_hist.find({
    "Item_SKU": "HHY009"
});

--Remove a document with Item SKU 'HHY009' value:
db.purchase_hist.remove(
	{Item_SKU: "HHY009"});

--Which is equivalent to the following SQL in RDBMS:
DELETE FROM purchase_hist
WHERE Item_SKU = 'HHY009'

--Show that document with Item SKU 'HHY009' value has been removed:
db.purchase_hist.find({
	Item_SKU: "HHY009"
});

Running this code should present you with the following output in a Windows command line pane.

MongoDB-part2-crud-foodb-execoutput-annoted

In the NEXT post I will go over some key commands and their variables used for data extraction.

Tags: , ,

Introduction to MongoDB NoSQL database for SQL Developers – Part 1

January 30th, 2014 / No Comments » / by admin

Note: This series is the first part of a collection of post depicting MongoDB NoSQL database key features and functionality. Further posts can be found HERE (part 2), HERE (part 3) and HERE (part 4).

Introduction

Coming from mainly relational database background, NoSQL movement seemed like a bit of a fad when it first started making rounds in the early 2000 and my immediate though was that perhaps with the exception of a few hippy Silicon Valley startups, it will end up being just a flash in the pan. It is 2014 and it turns out that not only has it proven to be a viable data storage solution for many enterprise-level and beyond (don’t know how one would refer to the likes of Google or Amazon) companies but it has been slowly making inroads into more conservative environments, where long-term reliance on RDBMSs from vendors such as Oracle, IBM or Microsoft seemed undisturbed.

Working in IT, it has been both, a curse and an asset to see changes proliferation throughout the industry, with new programing languages and frameworks popping up like wild mushrooms. However, for a typical database professional, since the inception of Codd’s famous relational model, things have been pretty steady for quite some time. SQL language has been the golden standard for all major database vendors and providing one could master basic methodologies and principles of database development and management, working as a DBA or database developer has been a steady and nonfluctuating field to be in. That’s until recently. Not long ago I wrote about the new data (warehouse) management frontier emerging in a form of a cloud deployment (link HERE). In this post I would like to explore one variant of the NoSQL databases plethora – document-oriented MongoDB.

MongoDB Database Characteristics

A quick Wikipedia lookup under NoSQL outlines a rich ecosystem of NoSQL database with some major categories such as column, key-value, document or graph database. MongoDB falls under document-oriented database, characterised by high performance, scalability and flexibility and low complexity and variable functionality. MongoDB is a free and open-source software and has been adopted as backend database by a number of major websites and services, including Craigslist, eBay, Foursquare, SourceForge, and The New York Times, among others. Main features include indexing support, MapReduce support for batch processing of data and aggregation operations, file storage as well as built-in replication and load balancing. Programmatic access is available in a number of languages such as C, C++, C#, Erlang, Java, Python, Ruby, and more.

MongoDB makes it (arguably) easy for someone accustomed to thinking in terms of tables, columns and rows as the mapping between those is quite straightforward. A ‘table’, represented in relational terms, equals to a ‘collection’ in MongoDB whereas a ‘row’ is represented by a ‘document’; everything else i.e. database, indexes, primary keys etc. definitions tend to overlap with few distinct differences. Another distinct feature is the fact that MongoDB is a schema-less database – a document’s key and values are not fixed types or sizes, therefore adding and removing fields (columns) becomes easier.

Finally, queries executed on MongoDB follow different syntax rules then those created using declarative SQL. MongoDB, by default, uses JSON-style notation which can be quite confusing at first attempt, regardless of your SQL expertise level. There have been a number of projects to simplify or standardise querying NoSQL data stores and few services exist where you can parse SQL query to have it converted into MongoDB syntax e.g. HERE, however, for me personally, it was easier to pick up MDX or DMX (languages used for OLAP and data mining queries) which by virtue of their syntax were quite a departure from my comfort zone i.e. ‘classic’ SQL, then to run with MongoDB native query implementation structure. That’s not to say that with a little bit of persistence and time it cannot be mastered; additionally, an occasional paradigm shift from the archetypical SQL provides a breath of fresh to otherwise very conventional field.

MongoDB Database Installation Overview

Without further ado let’s look at MongoDB in more detail. In this video I will show you how to download, install and perform basic configuration on a MongoDB instance.

Also, if you’re interested in Mongo’s query syntax essentials and would like to use it with something a little bit less primitive then the console window, please check out the remaining three posts to this series HERE (part 2), HERE (part 3) and HERE (part 4).

Tags: , ,