[Elasticsearch]: Performance Considerations for Indexing
Elasticsearch users have delightfully diverse use cases, ranging from appending tiny log-line documents to indexing Web-scale collections of large documents, and maximizing indexing throughput is often an important goal. While we try hard to set good general defaults…
[MongoDB]: Sharded cluster script
Shell script to run a three-shard cluster (three replica sets) on a single server. [root@dbversity.com bin]# cat shard_creation_script.sh ##### Killing the existing Mongo processes ################ for i in `ps -ef | egrep 'shardsvr|configsvr|replSet|configdb' | grep -v egrep | awk -F"…
[MongoDB-Hadoop]: Connectivity testing
Install Hadoop CDH 4/5, Hive, Pig, Java & MongoDB and set environment variables as below. [root@dbversity.com ~]# cat ~/.bashrc | grep "export" export PATH=$PATH:/opt/mongodb/bin export JAVA_HOME=/usr/java/jdk1.8.0_05 export PATH=$JAVA_HOME/bin:$PATH export HADOOP_HOME=/hadoop export PATH=$HADOOP_HOME/bin:$PATH export HIVE_HOME=/hadoop/hive-0.12.0-cdh5.0.0 export PATH=$HIVE_HOME/bin:$PATH export PIG_HOME=/hadoop/pig-0.12.0-cdh5.0.0 export PATH=$PIG_HOME/bin:$PATH…
[MongoDB]: _tmp significance in MongoDB dbpath
What is the significance of the _tmp directory in MongoDB, and why is it created? MongoDB can create a _tmp directory in at least two ways. INITIAL FILE ALLOCATION: During the file allocation,…
[MongoDB]: How Indexes Work
> use dbversitydb switched to db dbversitydb > for(i = 1; i <= 1000; i++) db.dbversity_website.insert( { post_id : i, comment_id : i, likes : i}); WriteResult({ "nInserted" : 1 }) > db.dbversity_website.findOne() { "_id"…
[MongoDB]: dropDups to remove duplicate records
The dropDups option for index creation removes duplicate records that already exist in the collection. > use foo switched to db foo > db.test.insert({ a:1 , b:1, c:1 }) WriteResult({ "nInserted" : 1…
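As a rough illustration of the behavior this post describes, the sketch below mimics in plain JavaScript what dropDups does: keep the first document seen for each indexed key and discard later ones. The function name `dropDuplicates` and the sample documents are hypothetical, and the code runs in memory rather than against MongoDB. Note that dropDups was removed in MongoDB 3.0, so on newer servers duplicates have to be cleaned up before a unique index is built.

```javascript
// In-memory sketch (not the MongoDB implementation): keep the first document
// for each value of the key fields and drop later duplicates.
function dropDuplicates(docs, keyFields) {
  const seen = new Set();
  const kept = [];
  for (const doc of docs) {
    // Build a comparable key from the indexed fields only.
    const key = JSON.stringify(keyFields.map((f) => doc[f]));
    if (!seen.has(key)) {
      seen.add(key);
      kept.push(doc);
    }
  }
  return kept;
}

// Hypothetical sample data, loosely following the post's { a, b, c } documents.
const docs = [
  { a: 1, b: 1, c: 1 },
  { a: 1, b: 2, c: 2 }, // same value of "a" as the first document
  { a: 2, b: 1, c: 3 },
];
const unique = dropDuplicates(docs, ["a"]);
console.log(unique.length);
```

Which duplicate survives depends only on iteration order here, so this is a sketch of the idea rather than a safe way to choose which records to keep.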
[MongoDB]: Unique Index
MongoDB allows you to specify a unique constraint on an index. These constraints prevent applications from inserting documents that have duplicate values for the indexed fields. > db.dbversity.createIndex( { "dbversity.post_id" : 1 } , { unique : true } )…
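To make the effect of a unique constraint concrete, here is a minimal in-memory sketch in plain JavaScript. The class `UniqueCollection` and the field name `post_id` used with it are illustrative assumptions, not MongoDB APIs; the sketch only shows that a second insert with an already-used value is rejected.

```javascript
// In-memory sketch of a unique constraint on one field (hypothetical class,
// not a MongoDB API). Assumes the indexed values are primitives.
class UniqueCollection {
  constructor(field) {
    this.field = field;
    this.docs = [];
    this.index = new Set(); // values already present for the indexed field
  }

  insert(doc) {
    const value = doc[this.field];
    if (this.index.has(value)) {
      throw new Error(`duplicate key: ${this.field} = ${value}`);
    }
    this.index.add(value);
    this.docs.push(doc);
  }
}

const posts = new UniqueCollection("post_id");
posts.insert({ post_id: 1, likes: 10 });

let rejected = false;
try {
  posts.insert({ post_id: 1, likes: 99 }); // same post_id: rejected
} catch (err) {
  rejected = true;
}
console.log(rejected, posts.docs.length);
```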
[MongoDB]: TTL (Time To Live) collections and Indexes
TTL (Time To Live) indexes are special single-field indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time. Data expiration is useful for certain types of information like machine-generated event data,…
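The expiry rule can be sketched in plain JavaScript as follows. The helper `isExpired`, the collection name `events`, and the field name `createdAt` are assumptions made for illustration; the real removal is done by a MongoDB background task, so expired documents may linger briefly. This sketch also ignores arrays of dates and the exact boundary behavior.

```javascript
// Sketch of the TTL expiry rule (hypothetical helper, not a MongoDB API).
// The corresponding index in the mongo shell would look like:
//   db.events.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  // TTL only applies when the indexed field holds a date.
  if (!(value instanceof Date)) return false;
  return now.getTime() - value.getTime() > expireAfterSeconds * 1000;
}

const now = new Date("2024-01-01T12:00:00Z");
const oldEvent = { createdAt: new Date("2024-01-01T10:00:00Z") };   // 2 hours old
const freshEvent = { createdAt: new Date("2024-01-01T11:30:00Z") }; // 30 minutes old
const noDate = { createdAt: "2024-01-01" };                         // a string, so it never expires

console.log(isExpired(oldEvent, "createdAt", 3600, now));
console.log(isExpired(freshEvent, "createdAt", 3600, now));
console.log(isExpired(noDate, "createdAt", 3600, now));
```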
[MongoDB]: DBRef (Database references) usage
DBRefs vs. Manual References: As an example scenario where we would use DBRefs instead of manual references, consider a database where we are storing different types of addresses (home, office, mailing, etc.) in different collections (address_home, address_office, address_mailing, etc.). Now,…
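A minimal sketch of the idea, in plain JavaScript with in-memory arrays standing in for the collections: a DBRef stores the target collection name in `$ref` and the target `_id` in `$id` (an optional `$db` field can name another database), so one field can point into any of the address collections. The sample documents and the `resolveRef` helper are hypothetical.

```javascript
// In-memory stand-ins for the collections named in the post.
const db = {
  address_home: [{ _id: 1, city: "Home City" }],
  address_office: [{ _id: 1, city: "Office City" }],
  address_mailing: [],
};

// A DBRef names the target collection ($ref) and the target _id ($id).
const user = {
  name: "example user",
  address: { $ref: "address_office", $id: 1 },
};

// Hypothetical resolver: look up the referenced document by collection and _id.
function resolveRef(database, ref) {
  const collection = database[ref.$ref] || [];
  return collection.find((doc) => doc._id === ref.$id) || null;
}

const address = resolveRef(db, user.address);
console.log(address.city);
```

With a manual reference the application would store only the `_id` and would need to know in advance which collection to query; the DBRef carries that information with it.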