[MongoDB]: Overview

• A Document database
• Developed by MongoDB Inc.
• Latest version at the time of writing is 3.0.2 (released on April 9, 2015)
• Runs on most platforms, including Red Hat Enterprise Linux, CentOS, Fedora, Ubuntu, Debian and other Linux distributions, OS X, and Windows
• Supports 32-bit and 64-bit architectures
• Clients supply data to MongoDB as JSON-like documents, which are stored in a binary form called BSON
• Language drivers and client libraries are available for Java, C, C++, C#, Erlang, JavaScript, Groovy, Clojure, Perl, PHP, Python, Ruby, and Scala

JSON Documents:

{ _id: 1, name: 'Ahmad', gender: 'M', dept: 'Fin' }
{ _id: 2, name: 'Bajrang', gender: 'M', dept: 'Sales' }
{ _id: 3, name: 'Catherine', gender: 'F', dept: 'HR' }
{ _id: 4, name: 'Dostoyevski', gender: 'M', dept: 'Prod' }

Getting Started

• Starting MongoDB server
C:\mongodb\bin\mongod.exe
• Starting MongoDB shell
C:\mongodb\bin\mongo.exe
• Viewing the name of the current database
db
• Viewing the list of databases
show dbs
• Creating a new database / using an existing database
use dbversity

Persons Documents

db.persons.insert( {
  name: {
    first: 'Harish',
    last: 'Chandra' },

  gender: 'M',
  yearOfBirth: 1962,
  livesIn: 'Mumbai',

  countriesVisited: [
    'India', 'Singapore', 'Thailand',
    'United Kingdom', 'Spain', 'Denmark',
    'United States of America'],

  languages: [
    {name: 'Hindi', proficiency: 'Fluent'},
    {name: 'English', proficiency: 'Fluent'},
    {name: 'Sanskrit', proficiency: 'Intermediate'} ]
} )

Write-ahead logging to an on-disk journal

MongoDB uses write-ahead logging to an on-disk journal to guarantee write operation durability and to provide crash resiliency.
When a write operation occurs:

• MongoDB writes data about the operation to the private view in RAM and then copies it to the journal on disk, in batches called group commits, by default every 100 milliseconds.
• It then applies the changes to the shared view (in memory), which now becomes inconsistent with the data files.
• At default intervals of 60 seconds, MongoDB flushes the shared view to disk and removes the flushed write operations from the journal.
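
The three steps above can be pictured with a small in-memory model. This is only a sketch in plain JavaScript, not MongoDB's implementation: the class `JournaledStore` and its method names are hypothetical, and the 100 ms and 60 s timers are replaced by explicit method calls.

```javascript
// Toy model of write-ahead journaling (illustrative only, not MongoDB internals).
class JournaledStore {
  constructor() {
    this.privateView = []; // writes buffered in RAM, not yet journaled
    this.journal = [];     // stands in for the on-disk journal
    this.sharedView = {};  // in-memory view of the data files
    this.dataFiles = {};   // stands in for the data files on disk
  }

  write(key, value) {
    this.privateView.push({ key, value });
  }

  // Step 1: copy the buffered writes to the journal in one batch (a "group commit").
  groupCommit() {
    this.journal.push(...this.privateView);
    this.privateView = [];
  }

  // Step 2: apply the journaled writes to the shared view.
  applyToSharedView() {
    for (const { key, value } of this.journal) this.sharedView[key] = value;
  }

  // Step 3: flush the shared view to the data files and trim the journal.
  flushToDataFiles() {
    this.dataFiles = { ...this.sharedView };
    this.journal = [];
  }

  // After a crash, replay the journal on top of the data files.
  recover() {
    const recovered = { ...this.dataFiles };
    for (const { key, value } of this.journal) recovered[key] = value;
    return recovered;
  }
}

const store = new JournaledStore();
store.write("a", 1);
store.groupCommit();
store.applyToSharedView();
// If the process crashed here, the data files would still be empty,
// but replaying the journal restores the write.
console.log(store.recover()); // { a: 1 }
store.flushToDataFiles();
console.log(store.journal.length); // 0
```

The point of the model is the ordering: a write reaches the journal before it reaches the data files, so anything acknowledged after a group commit can be rebuilt by replay.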

Schema Design :

Traditional, Relational Database Design

• Usually highly normalized
• Eliminates problems of data duplication and redundancy
• However, too many tables result in expensive joins
• Too many simultaneous operations involving too many joins can degrade performance significantly
• Database design is application-independent. This provides more flexibility, but may not meet application-specific or feature-specific performance requirements
• Strong transaction support

MongoDB Database Design

• Denormalize the data. Data that needs to be presented or processed together is preferably kept together within one document (pre-joined, embedded data)
• Application-specific design
• May result in data duplication and related issues
• Schemaless design gives flexibility in document structure, but this does not mean that no schema is planned
• No foreign key constraints; uniqueness can be enforced only through unique indexes
• Atomic operations within a single document, but no multi-document ACID transactions (as of version 3.0)

The _id Field

• Reserved for use as the primary key
• You can specify the value yourself. Example:
db.persons.insert( {_id: 101, name: {first: "Robinson", last: "Crusoe"} } )
• If you do not specify a value, the insert() method adds the field with a unique ObjectId as its value
• ObjectId is a 12-byte unique identifier
• The value must be unique within a collection
• The value is immutable
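
The rules above (a caller-supplied or generated _id, unique per collection) can be mimicked with a few lines of plain JavaScript. This is a sketch under stated assumptions: `TinyCollection` and `makeId` are hypothetical names, and `makeId` only imitates the 12-byte size of an ObjectId with random bytes, not its real internal layout.

```javascript
// Hypothetical stand-in for ObjectId: 12 random bytes written as 24 hex characters.
function makeId() {
  let hex = "";
  for (let i = 0; i < 12; i++) {
    hex += Math.floor(Math.random() * 256).toString(16).padStart(2, "0");
  }
  return hex;
}

// In-memory model of a collection that enforces a unique _id.
class TinyCollection {
  constructor() {
    this.docs = new Map(); // keyed by _id, so a duplicate is easy to detect
  }

  insert(doc) {
    // Use the caller's _id if present, otherwise generate one.
    const _id = doc._id !== undefined ? doc._id : makeId();
    if (this.docs.has(_id)) {
      throw new Error("duplicate key: _id " + _id);
    }
    const stored = { ...doc, _id };
    this.docs.set(_id, stored);
    return stored;
  }
}

const persons = new TinyCollection();
persons.insert({ _id: 101, name: { first: "Robinson", last: "Crusoe" } });

// No _id supplied: one is generated on insert.
const generated = persons.insert({ name: { first: "Harish", last: "Chandra" } });
console.log(generated._id.length); // 24

// A second document with _id 101 is refused.
let rejected = false;
try {
  persons.insert({ _id: 101, name: { first: "Another", last: "Person" } });
} catch (e) {
  rejected = true;
}
console.log(rejected); // true
```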

One-to-Many: Relational Style Design :

db.posts.insert( {
  _id: 1,
  title: "My First Post",
  author: "charlie",
  date: new Date("2014-04-25"),
  post: "This is my first post"
} )

db.comments.insert( {
  _id: 1,
  postid: 1,
  author: "andrew",
  comment: "this is funny",
  order: 1
} )
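
With this relational-style layout, showing a post together with its comments takes two lookups that the application has to join itself. The sketch below illustrates that on plain arrays standing in for the two collections; the helper name `postWithComments` is hypothetical and no database is involved.

```javascript
// Plain arrays stand in for the posts and comments collections.
const posts = [
  { _id: 1, title: "My First Post", author: "charlie", post: "This is my first post" },
];
const comments = [
  { _id: 1, postid: 1, author: "andrew", comment: "this is funny", order: 1 },
];

// Hypothetical helper: one lookup for the post, a second one for its comments.
function postWithComments(postId) {
  const post = posts.find((p) => p._id === postId);
  if (!post) return null;
  const related = comments
    .filter((c) => c.postid === postId)
    .sort((a, b) => a.order - b.order);
  return { ...post, comments: related };
}

const joined = postWithComments(1);
console.log(joined.comments.length);    // 1
console.log(joined.comments[0].author); // andrew
```

The result has the same shape as an embedded document, which is what the embedding design stores directly.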

One-to-Many: Embedding Data :

db.posts.insert( {
  _id: 1,
  title: "My First Post",
  author: "charlie",
  date: new Date("2014-04-25"),
  post: "This is my first post",
  comments: [
    {user: "andrew", comment: "this is funny"},
    {user: "dilshad", comment: "Welcome to blogging"}
  ]
} )
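
One-to-Many: When to Embed

Embedding works well when the "many" side is small and bounded:
• Employee: Qualifications
• Employee: Work experience
• Employee: Dependents
• Customer: Orders
• Book: Reviews

But what about cases where the "many" side is very large or keeps growing?
• City: Citizens (millions of them)
• Employee: Daily attendance data for many years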

One-to-Many: Not Embedding Data :

db.citizens.insert( {name: "kamal", city: "Mumbai"} )
db.cities.insert( {_id: "Mumbai", population: "1.4 million"} )

Many-to-Many

Books: Authors
Students: Courses
Posts: Tags

Students-Courses

db.students.insert( {_id: 1, name: "Amar", courses: [2, 3, 4] } )
db.students.insert( {_id: 2, name: "Akbar", courses: [1, 2] } )
db.students.insert( {_id: 3, name: "Anthony"} )
db.students.insert( {_id: 4, name: "Seeta", courses: [1, 4] } )
db.students.insert( {_id: 5, name: "Geeta", courses: [3] } )
db.courses.insert( {_id: 1, title: "African History", students: [2, 4] } )
db.courses.insert( {_id: 2, title: "Anthropology", students: [1, 2] } )
db.courses.insert( {_id: 3, title: "Indian Constitution", students: [1, 5] } )
db.courses.insert( {_id: 4, title: "European Geography", students: [1, 4] } )
db.courses.insert( {_id: 5, title: "Space Exploration"} )
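
Because each side only stores the ids of the other side, the application resolves the references itself and has to keep both arrays in agreement. The sketch below shows both tasks on plain arrays holding the same data; the helper names `courseTitlesFor` and `linksAreConsistent` are hypothetical.

```javascript
// The same documents as above, held in plain arrays.
const students = [
  { _id: 1, name: "Amar", courses: [2, 3, 4] },
  { _id: 2, name: "Akbar", courses: [1, 2] },
  { _id: 3, name: "Anthony" },
  { _id: 4, name: "Seeta", courses: [1, 4] },
  { _id: 5, name: "Geeta", courses: [3] },
];
const courses = [
  { _id: 1, title: "African History", students: [2, 4] },
  { _id: 2, title: "Anthropology", students: [1, 2] },
  { _id: 3, title: "Indian Constitution", students: [1, 5] },
  { _id: 4, title: "European Geography", students: [1, 4] },
  { _id: 5, title: "Space Exploration" },
];

// Resolve a student's course ids into course titles.
function courseTitlesFor(studentId) {
  const student = students.find((s) => s._id === studentId);
  const ids = (student && student.courses) || [];
  return courses.filter((c) => ids.includes(c._id)).map((c) => c.title);
}

// Check that every student listed on a course also lists that course.
function linksAreConsistent() {
  return courses.every((c) =>
    (c.students || []).every((sid) => {
      const s = students.find((x) => x._id === sid);
      return s !== undefined && (s.courses || []).includes(c._id);
    })
  );
}

console.log(courseTitlesFor(1));
// [ 'Anthropology', 'Indian Constitution', 'European Geography' ]
console.log(courseTitlesFor(3));    // []
console.log(linksAreConsistent()); // true
```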

Many-to-Many: Embedding

We can completely embed data from one side into the other (Students into Courses or vice versa), but the following issues arise:

• Update anomalies
• What if an embedded object needs to exist before any "parent" object has been created?

Blog Site: Schema Design :

db.posts.insert( {
  title: "My First Post",
  author: "charlie",
  date: new Date("2014-04-25"),
  post: "This is my first post",
  tags: ["JSON", "MongoDB"],

  comments: [
    {user: "andrew", comment: "this is funny"},
    {user: "dilshad", comment: "Welcome to blogging"}
  ]
} )

db.users.insert( {_id: "andrew", name: {first: "Andrew", last: "Weiss"}, email: "andrew@example.com"} )
db.users.insert( {_id: "bharat", name: {first: "Bharat", last: "Kumar"}, email: "bharat@example.com"} )
db.users.insert( {_id: "charlie", name: {first: "Charlie", last: "Gordon"}, email: "charlie@example.com"} )
db.users.insert( {_id: "dilshad", name: {first: "Dilshad", last: "Khan"}, email: "dilshad@example.com"} )

E-Commerce Catalog: Schema

db.categories.insert( {_id: 1, name: "Clothing and accessories"} )
db.categories.insert( {_id: 2, name: "Boys", ancestors: [1] } )
db.categories.insert( {_id: 3, name: "Shirts", ancestors: [1, 2] } )
db.products.insert( {_id: 1, name: "612 Ivy League Boys Shirts", price: 799, category: 3} )
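
Storing the full list of ancestors on each category means a product's category path can be built without walking the tree level by level. A minimal sketch on plain arrays, assuming the ancestors are listed from the root downwards as in the data above; the helper name `breadcrumb` is hypothetical.

```javascript
// The same catalog documents, held in plain arrays.
const categories = [
  { _id: 1, name: "Clothing and accessories" },
  { _id: 2, name: "Boys", ancestors: [1] },
  { _id: 3, name: "Shirts", ancestors: [1, 2] },
];
const products = [
  { _id: 1, name: "612 Ivy League Boys Shirts", price: 799, category: 3 },
];

// Build "root > ... > category" for a product from the ancestors array.
function breadcrumb(productId) {
  const product = products.find((p) => p._id === productId);
  const category = categories.find((c) => c._id === product.category);
  const path = [...(category.ancestors || []), category._id];
  return path
    .map((id) => categories.find((c) => c._id === id).name)
    .join(" > ");
}

console.log(breadcrumb(1)); // Clothing and accessories > Boys > Shirts
```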

Benefits of Embedding

• Data is pre-joined (denormalized)
• Single round trip to the database
• Improved read and write performance

However, if an update makes a document grow beyond the space allocated to it, the document has to be relocated on disk, which reduces performance.

Capped Collections

Fixed-size collections.

• Data is stored on disk in insertion order.
• They work like circular buffers: once the collection reaches its capacity, the oldest document is overwritten.
• Retrieving data in insertion order provides high throughput.
• Updates that increase the document size are prohibited.
• Example: oplog.rs, which stores a log of operations in a replica set, is a capped collection.
• Good for storing log information.
• You cannot delete documents from a capped collection, except by dropping the collection altogether.
• Cannot be sharded.
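
The circular-buffer behaviour can be illustrated with a small ring buffer. This is a simplified sketch, not how MongoDB stores data: the class `CappedBuffer` is hypothetical, and it limits only the number of documents, whereas a real capped collection is bounded primarily by its size in bytes.

```javascript
// Ring buffer that keeps at most `max` documents and overwrites the oldest.
class CappedBuffer {
  constructor(max) {
    this.max = max;
    this.slots = new Array(max);
    this.next = 0;  // index of the slot the next insert will use
    this.count = 0; // number of documents currently stored
  }

  insert(doc) {
    this.slots[this.next] = doc; // overwrites the oldest document once full
    this.next = (this.next + 1) % this.max;
    this.count = Math.min(this.count + 1, this.max);
  }

  // Return the documents in insertion order, oldest first.
  find() {
    const start = this.count < this.max ? 0 : this.next;
    const out = [];
    for (let i = 0; i < this.count; i++) {
      out.push(this.slots[(start + i) % this.max]);
    }
    return out;
  }
}

const eventlog = new CappedBuffer(3);
for (let i = 1; i <= 5; i++) {
  eventlog.insert({ event: i });
}
console.log(eventlog.find().map((d) => d.event)); // [ 3, 4, 5 ]
```

Events 1 and 2 are gone without any explicit delete, which is why this structure suits logs where only the most recent entries matter.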

Creating a Capped Collection :

db.createCollection('eventlog', {capped: true, size: 64*1024, max: 32} )
db.eventlog.insert( {description: 'some dummy event'} )
db.eventlog.insert( {description: 'another dummy event'} )
db.eventlog.isCapped()

Application Development Using Java :

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

MongoClient client = new MongoClient();
MongoDatabase database = client.getDatabase("dbversity");
MongoCollection<Document> persons = database.getCollection("persons");
Document document = persons.find().first();
System.out.println(document.toJson());

// Alternative ways to create a MongoClient (use one of them):
MongoClient mongoClient = new MongoClient();
MongoClient mongoClient = new MongoClient("localhost");
MongoClient mongoClient = new MongoClient("localhost", 27017);
// A list of server addresses
MongoClient mongoClient = new MongoClient(Arrays.asList(new ServerAddress("localhost", 27017), new ServerAddress("localhost", 27018), new ServerAddress("localhost", 27019)));
// The same servers expressed as a connection string
MongoClientURI connectionString = new MongoClientURI("mongodb://localhost:27017,localhost:27018,localhost:27019");
MongoClient mongoClient = new MongoClient(connectionString);

MongoClient

• Represents a pool of connections to the database, so you need only one instance of it even with multiple threads.
• To dispose of an instance, call its close() method.

Document lang1 = new Document().append("name", "Kannada").append("proficiency", "Fluent");
Document lang2 = new Document("name", "Sanskrit").append("proficiency", "Fluent");
Document lang3 = new Document().append("name", "English").append("proficiency", "Fluent");

List<Document> languages = Arrays.asList(lang1, lang2, lang3);

Document document = new Document()
    .append("name", new Document("first", "Sampath").append("last", "Kumar"))
    .append("gender", "M")
    .append("yearOfBirth", 1958)
    .append("livesIn", "Mysore")
    .append("countriesVisited", Arrays.asList("India"))
    .append("languages", languages);

persons.insertOne(document);

Inserting Multiple Documents :

List<Document> documents = new ArrayList<Document>();

documents.add(new Document().append("name", new Document().append("first", "Shalini").append("last", "Varadaraj")));
documents.add(new Document().append("name", new Document().append("first", "Susanna").append("last", "Weiss")));

persons.insertMany(documents);

Counting documents :

long count = persons.count();
System.out.println(count);

Finding One Document with Filter :

Document query = new Document().append("gender", "F");
Document result = persons.find(query).first();
System.out.println(result.toJson());

Finding Multiple Documents with Filter :

Document query = new Document().append("livesIn", "Mumbai");

MongoCursor<Document> cursor = persons.find(query).iterator();

while (cursor.hasNext()) {
    System.out.println(cursor.next().toJson());
}
cursor.close();

Query Builder :

// Example 1: women born after 1980 and before 2000
QueryBuilder builder = QueryBuilder.start("gender").is("F").and("yearOfBirth").greaterThan(1980).lessThan(2000);

MongoCursor<Document> cursor = persons.find((Bson) builder.get()).iterator();

// Example 2 (a separate snippet): women who do not live in Siliguri
QueryBuilder builder = QueryBuilder.start("gender").is("F").and("livesIn").notEquals("Siliguri");

MongoCursor<Document> cursor = persons.find((Bson) builder.get()).iterator();

Field Selection :

Document query = new Document()
    .append("gender", "F")
    .append("livesIn", new Document("$ne", "Siliguri"));

Document fields = new Document()
    .append("name.first", true)
    .append("gender", true)
    .append("livesIn", true)
    .append("_id", false);

MongoCursor<Document> cursor = persons.find(query).projection(fields).iterator();

Field Selection (Alternative Method)

import static com.mongodb.client.model.Filters.*;
import static com.mongodb.client.model.Projections.*;

Bson query = and(eq("gender", "F"), eq("countriesVisited", "United States of America"));
MongoCursor<Document> cursor = persons.find(query).projection(include("name", "gender", "yearOfBirth")).iterator();

Sort, Skip, Limit :

Document query = new Document().append("gender", "F").append("livesIn", new Document("$ne", "Siliguri"));
Document fields = new Document().append("name", true).append("gender", true).append("livesIn", true).append("_id", false);
Document sortBy = new Document().append("name.first", 1).append("name.last", 1);
MongoCursor<Document> cursor = persons.find(query).projection(fields).sort(sortBy).skip(3).limit(3).iterator();
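
The effect of sort, skip(3) and limit(3) is easiest to see on a plain array: the documents are ordered first, the first three are skipped, and the next three are returned, which amounts to "page 2" with a page size of three. The sketch below is plain JavaScript on made-up sample data, not driver code; the helper names `compareStrings` and `sortSkipLimit` are hypothetical.

```javascript
// Made-up sample documents; only their shape mirrors the persons collection.
const people = ["Gita", "Asha", "Esha", "Chitra", "Hema", "Bina", "Farah", "Devi"]
  .map((first) => ({ name: { first, last: "Rao" } }));

function compareStrings(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Sort by name.first, then name.last; then skip and limit.
function sortSkipLimit(docs, skip, limit) {
  return [...docs]
    .sort((a, b) =>
      compareStrings(a.name.first, b.name.first) ||
      compareStrings(a.name.last, b.name.last))
    .slice(skip, skip + limit);
}

const secondPage = sortSkipLimit(people, 3, 3).map((p) => p.name.first);
console.log(secondPage); // [ 'Devi', 'Esha', 'Farah' ]
```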

Another Way of Finding Documents :

import static com.mongodb.client.model.Filters.*;

persons.find(and(eq("name.first", "Anita"), eq("name.last", "Gogia")));

See http://api.mongodb.org/java/3.0/?com/mongodb/client/model/Filters.html for details

Updating One Document :

persons.updateOne(and(eq("name.first", "Anita"), eq("name.last", "Gogia")), new Document("$set", new Document("livesIn", "Shimla")));

// The same update, keeping the result so that it can be inspected
UpdateResult updateResult = persons.updateOne(and(eq("name.first", "Anita"), eq("name.last", "Gogia")), new Document("$set", new Document("livesIn", "Shimla")));

System.out.println(updateResult);

Updating multiple documents :

UpdateResult updateResult = persons.updateMany(eq("name.first", "Jenny"), new Document("$set", new Document("livesIn", "Toronto")));
System.out.println(updateResult);
System.out.println(updateResult.getModifiedCount());

// Upsert (a separate snippet): update the matching document, or insert one if nothing matches
UpdateResult updateResult = persons.updateOne(
    and(eq("name.first", "John"), eq("name.last", "Baptist")),
    new Document("$set", new Document("livesIn", "Ajmer")),
    new UpdateOptions().upsert(true));

Deleting (zero or) one document:

DeleteResult deleteResult = persons.deleteOne(and(eq("name.first", "John"), eq("name.last", "Baptist")));

Deleting multiple documents:

DeleteResult deleteResult = persons.deleteMany(eq("married", "No"));

Writing to GridFS :

FileInputStream inputStream = null;
String inputFile = "C:\\videos_can_be_deleted\\VTS_01_1.VOB";
// This GridFS class works with the older DB API, hence getDB()
DB database = client.getDB("dbversity");
GridFS videos = new GridFS(database, "videos");
try {
    inputStream = new FileInputStream(inputFile);
} catch (FileNotFoundException e) {
    System.out.println("Unable to open file");
    System.exit(1);
}
GridFSInputFile video = videos.createFile(inputStream, inputFile);
DBObject data = new BasicDBObject("description", "IIMA-SJR participants on a morning walk");
video.setMetaData(data);
video.save();
System.out.println("Object ID in Files collection: " + video.get("_id"));

Reading Back from GridFS

String outputFile = …
DB database = client.getDB("dbversity");
GridFS videos = new GridFS(database, "videos");
GridFSDBFile file = videos.findOne(new BasicDBObject("filename", outputFile));
// try-with-resources closes the stream even if the copy fails
try (FileOutputStream outputStream = new FileOutputStream(outputFile)) {
    file.writeTo(outputStream);
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

Inspecting the stored file from the mongo shell:

show collections
db.videos.files.find().pretty()
db.videos.chunks.find( {}, {data: 0} )
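
The two collections seen in the shell reflect how GridFS stores a file: one document describing the file in videos.files, and the file's bytes split into numbered pieces in videos.chunks. The sketch below imitates the splitting step in plain JavaScript. It is an illustration, not driver code: `toChunks` is a hypothetical helper, the 255 KiB chunk size is assumed to be the default, and only the `files_id`, `n` and `data` fields of a chunk document are modelled.

```javascript
const CHUNK_SIZE = 255 * 1024; // assumed default chunk size, in bytes

// Split a byte array into chunk documents numbered from 0.
function toChunks(filesId, bytes, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < bytes.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: filesId, // links the chunk back to its files document
      n,                 // position of the chunk within the file
      data: bytes.subarray(offset, offset + chunkSize),
    });
  }
  return chunks;
}

// 600 KiB of zeros standing in for a video file.
const fakeVideo = new Uint8Array(600 * 1024);
const chunks = toChunks("file-1", fakeVideo);

console.log(chunks.length);         // 3
console.log(chunks[2].data.length); // 92160
```

Reading a file back is the reverse: fetch the chunks with a given files_id, order them by n, and concatenate their data.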
For more details visit api.mongodb.org/java/current
