Introduction to the MongoDB world

MongoDB is an open-source, schemaless, NoSQL database. It uses a form of JSON to store and manipulate its data. Data is actually stored in BSON format, which is short for Binary JSON. BSON adds some extra data types on top of JSON.

These include types such as ObjectId and Date.
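
To see why plain JSON falls short here, the short sketch below round-trips a document through JSON. It is plain JavaScript that runs in Node.js without MongoDB, and the document and variable names are made up for illustration. The date comes back as a string, which is the kind of type information BSON is meant to preserve.

```javascript
// Plain JavaScript sketch; no MongoDB server is involved.
// JSON has no date type, so a Date is flattened to a string.
const original = { title: 'demo', saved_on: new Date(0) };
const roundTripped = JSON.parse(JSON.stringify(original));

const originalIsDate = original.saved_on instanceof Date;
const roundTrippedType = typeof roundTripped.saved_on;

console.log(originalIsDate);    // true
console.log(roundTrippedType);  // string
```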

1) MongoDB comes with some utilities. It is available as an installer and as a zip archive. If you opt for the zip, you can add its bin directory to your PATH. Once we have MongoDB on our system, we can start the server with the mongod utility.

example ::
<path/to/mongodb/bin>mongod --dbpath path/where/you/want to keep your/database/files (when mongodb/bin is not in PATH)
any/location>mongod --dbpath path/where/you/want to keep your/database/files (when mongodb/bin is in PATH)
This command starts a MongoDB server which listens for clients on port 27017.

2) The mongo process is a client that connects to a mongod server. It is a JavaScript console, so we write JavaScript to work with a MongoDB database.
>mongo
connecting to: test
On executing the above command, mongo connects to the test database by default.
Once we execute the db command on the console, it shows the database to which we are currently connected.
>db
test

3) mongoexport is a utility used to export data from MongoDB to a JSON file.
4) mongoimport takes a JSON file and creates the data inside MongoDB collections.
5) mongodump creates a database dump, much like mongoexport, but it keeps the data as BSON.
6) mongorestore, as you may have guessed, restores a dump created by mongodump.
7) bsondump is used to convert a BSON file to JSON.
8) mongostat is another utility, used to keep track of server statistics.

*****************************************
MongoDB, being NoSQL, is different from an RDBMS. We will first try to understand some terminology specific to MongoDB. This will help us understand its working principles a bit.

1) Database: same as in an RDBMS. We use the word db in queries.
2) Collections: They correspond to tables, but they do not have a schema. This means that for a collection we need not define column names, their datatypes or their constraints. As a result, we can have two records in the same collection with entirely different data, which is not a very good idea but is technically possible. To avoid this we need to enforce a schema in server code, because MongoDB does not enforce one by default.
3) Document: A document can be considered a row or record.
4) Field: It is a {key : value} pair. It can be considered a column and its value.
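
The idea of a schemaless collection, and of enforcing a schema in application code instead, can be sketched in plain JavaScript. In this sketch an array stands in for a collection, and isValidBook is a hypothetical application-side check, not a MongoDB feature.

```javascript
// Plain JavaScript sketch; an array stands in for a collection.
// Two "documents" with unrelated fields can sit side by side.
const collection = [
  { title: 'The immortals of meluha', author: 'Amish Tripathy' },
  { name: 'sharad' }
];

// Hypothetical application-side check: the database is not consulted.
function isValidBook(doc) {
  return typeof doc.title === 'string' && typeof doc.author === 'string';
}

const validity = collection.map(isValidBook);
console.log(validity); // [ true, false ]
```

A real application would run such a check before every insert, since nothing in the collection itself rejects the second document.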

Now let us start working with MongoDB.
*********************************************
> mongo (assuming mongodb/bin is present in PATH)
This command will connect to the test database.
>show dbs (shows all databases present)
>use testdb (A database or collection is not created until we start inserting documents, so the use command only updates the JavaScript variable db.)
>db (executing this command shows the value currently assigned to the db variable.)
testdb
>db.cities.count(); –> this command returns the number of documents present in the cities collection. As no record has been inserted into cities, it returns 0.
0

INSERT :-

>db.books.insert({title:'The immortals of meluha',author:'Amish Tripathy',tags:['shiva','mythology'],description:'A great work of fiction.',saved_on: new Date()});
–> One of the values is an array.
–> Both arrays and sub-documents are possible inside a document.
–> _id is unique and immutable.
–> Documents returned by db.books.find() have an additional field, _id. It is the id of the document and is indexed by default. It is generated if not provided.
–> A generated _id (an ObjectId) is built from a timestamp, a machine- and process-specific part (host name and process id in older versions) and an incrementing counter.
>db.books.find()[0]._id.getTimestamp(); gives the time at which this record was inserted or, to be more precise, the time at which the ObjectId was created.
>db.books.find().pretty(); makes the JSON output look good and organized.
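
How getTimestamp() can recover a time from an id is easier to see with a small sketch. It assumes the standard ObjectId layout, in which the first 4 bytes (8 hex characters) hold the creation time in seconds since the epoch. objectIdTimestamp is a hypothetical helper written in plain JavaScript, not a shell function, and the id it decodes is built by hand for illustration.

```javascript
// Plain JavaScript sketch; objectIdTimestamp is a hypothetical helper,
// not part of the mongo shell. It assumes the first 8 hex characters
// of an ObjectId hold seconds since the Unix epoch.
function objectIdTimestamp(hexId) {
  const secondsSinceEpoch = parseInt(hexId.substring(0, 8), 16);
  return new Date(secondsSinceEpoch * 1000);
}

// Build an illustrative 24-character id for a known instant:
// 8 hex characters of timestamp followed by 16 filler zeros.
const createdAt = new Date('2020-01-01T00:00:00Z');
const seconds = Math.floor(createdAt.getTime() / 1000);
const fakeId = seconds.toString(16).padStart(8, '0') + '0'.repeat(16);

const decoded = objectIdTimestamp(fakeId);
console.log(decoded.toISOString()); // 2020-01-01T00:00:00.000Z
```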

As we are in a JavaScript console, we can also build a record the JavaScript way and then insert it.
>var doc = {};
>doc.title = "Secret of the Nagas";
>doc.author = "Amish Tripathy";
>doc.tags = ['shiva','mythology'];
>doc.description = 'sequel of "The immortals of meluha"';
>doc.saved_on = new Date();

>db.books.save(doc);
–> The save command works like an upsert. It checks whether the _id field is present in the document; if so, it updates the document with that _id, otherwise it inserts a new record with a new _id field.
–> The question that comes up now is: why such a complex id? What is wrong with a single incrementing number? Can we have a single incrementing number as the id in MongoDB?
–> MongoDB is designed to work in a clustered environment, and it is really difficult to keep track of an incrementing number across a cluster. So a more complex scheme is used, which has a very high probability of producing a unique value.
–> We can use anything except an array as an _id.

>db.books.insert({_id : 100, title: 'The oath of Vayuputras'});
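
The save behaviour described above can be mimicked in a few lines of plain JavaScript. This is only a sketch: a Map stands in for the collection, and nextId is a made-up id generator, whereas the real shell would generate an ObjectId.

```javascript
// Plain JavaScript sketch of the behaviour described for save().
// A Map stands in for a collection; nextId is a made-up id generator
// (the real shell would generate an ObjectId instead).
const books = new Map();
let nextId = 1;

function save(collection, doc) {
  if (doc._id === undefined) {
    doc._id = nextId++;          // no _id: insert under a fresh id
  }
  collection.set(doc._id, doc);  // existing _id: replace, otherwise insert
  return doc._id;
}

const firstId = save(books, { title: 'Secret of the Nagas' });
save(books, { _id: firstId, title: 'Secret of the Nagas', author: 'Amish Tripathy' });
save(books, { _id: 100, title: 'The oath of Vayuputras' });

console.log(books.size);                // 2
console.log(books.get(firstId).author); // Amish Tripathy
```

The second call reuses an existing _id, so it replaces the first document rather than adding a third one.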

–> To have a single incrementing number as the id, we need to create a JavaScript function.
>function counter(name){
var ret = db.counters.findAndModify({query : {_id:name},update : {$inc : { val : 1}},'new' : true , upsert : true});
return ret.val;
}
>db.products.insert({_id : counter('products'), title: 'Product 1'});
>db.products.insert({_id : counter('products'), title: 'Product 2'});
>db.products.insert({_id : counter('products'), title: 'Product 3'});
>db.products.find().pretty();
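
For readers without a server at hand, the plain JavaScript sketch below mimics what the findAndModify call inside counter does with these options. A Map stands in for the counters collection, and incrementAndGet is a hypothetical helper. The real operation is applied atomically to a single document on the server, which this sketch does not attempt to reproduce.

```javascript
// Plain JavaScript sketch; a Map stands in for the counters collection.
// incrementAndGet is a hypothetical helper that mimics findAndModify
// with $inc, upsert: true and new: true.
const counters = new Map();

function incrementAndGet(name) {
  // upsert: a missing counter behaves as if it started from 0
  const current = counters.has(name) ? counters.get(name).val : 0;
  // $inc: { val: 1 }
  const updated = { _id: name, val: current + 1 };
  counters.set(name, updated);
  // new: true returns the document as it looks after the update
  return updated.val;
}

const ids = [
  incrementAndGet('products'),
  incrementAndGet('products'),
  incrementAndGet('products')
];
console.log(ids); // [ 1, 2, 3 ]
```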

–> Observe that the ids of the last three records inserted are in sequence.
–> We will discuss findAndModify in detail in a future post. This was just to show that we can use a sequence to generate ids if needed.
–> MongoDB queries have no joins for fetching data from two collections based on primary key/foreign key relationships. The reason is that a join is too expensive when used across a large cluster of servers. Since we cannot use joins, we need to change our view of database normalization when we work with MongoDB.
–> In a relational database we tend to normalize our tables to third normal form, but sometimes it is nice to have a denormalized table. For example, our books collection has tags, which contains an array. It is highly unlikely that we will need to change a tag once it is inserted. However, if we do have to update a tag, it will be an expensive operation for MongoDB, because every document that contains it must be changed.

–> Suppose we need two collections that are related to each other. For example, let us have a users collection:
> db.users.insert({name:'sharad'});

–> Now we want to link this user with a product in the products collection. To do this we perform the following operations.
> var a = db.users.findOne({name:'sharad'}); // we will look into findOne in my next post
> db.products.insert({_id : counter('products'), title: 'Product 3',user : a._id}); // linking (normalized)

–> In the above example we linked two documents. To retrieve them we have to execute two queries, as joins do not work in MongoDB.
>db.products.insert({_id : counter('products'), title: 'Product 3',user : {name:'sharad'}}); // embedding (denormalized)

–> There are two ways to express a relationship: linking and embedding.
–> If we use our database mostly for read operations, keeping it denormalized will boost performance; if we mostly perform updates, keeping it normalized will give better performance.
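
The trade-off between linking and embedding can be sketched in plain JavaScript. Arrays and objects stand in for collections and documents here, and the numeric ids and the renamed value are made up for illustration. The sketch shows that a linked read needs a second lookup, while an embedded copy goes stale when the user changes.

```javascript
// Plain JavaScript sketch; arrays stand in for collections and the
// numeric ids are made up for illustration.
const users = [{ _id: 1, name: 'sharad' }];

// Linking (normalized): the product stores only the user's _id.
const linkedProduct = { _id: 1, title: 'Product 3', user: 1 };

// Embedding (denormalized): the product carries a copy of the user data.
const embeddedProduct = { _id: 2, title: 'Product 3', user: { name: 'sharad' } };

// Linked read: a second lookup is needed to resolve the user.
const linkedOwner = users.find(u => u._id === linkedProduct.user).name;

// Embedded read: the name is already inside the product document.
const embeddedOwner = embeddedProduct.user.name;

// Renaming the user: the linked product follows automatically,
// while the embedded copy stays stale until it is updated too.
users[0].name = 'sharad k';
const linkedAfterRename = users.find(u => u._id === linkedProduct.user).name;
const embeddedAfterRename = embeddedProduct.user.name;

console.log(linkedOwner, embeddedOwner);             // sharad sharad
console.log(linkedAfterRename, embeddedAfterRename); // sharad k sharad
```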
