MongoDB Balancer Tips …
- Run the balancer at low traffic times
mongos> use config
switched to db config
mongos> db.settings.update({_id: "balancer"}, {$set : { activeWindow: { start: "9:00", stop: "21:00"}}})
mongos> db.settings.find()
{ "_id" : "chunksize", "value" : 64 }
{ "_id" : "balancer", "activeWindow" : { "start" : "9:00", "stop" : "21:00" }, "stopped" : false }
- Chunk migrations can also be triggered manually using moveChunk.
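A low-traffic window often runs overnight, so its stop time can be earlier than its start time. The following is a minimal sketch, in plain JavaScript, of how such a window might be evaluated. The helper names `toMinutes` and `inActiveWindow` are hypothetical and are not part of MongoDB. The handling of the boundaries (start inclusive, stop exclusive) is an assumption made for illustration.

```javascript
// Convert "H:MM" or "HH:MM" into minutes since midnight.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
}

// Return true if `now` falls inside the window that runs from `start` to `stop`.
// Assumption: a window whose stop is earlier than its start wraps past midnight.
function inActiveWindow(start, stop, now) {
  const s = toMinutes(start);
  const e = toMinutes(stop);
  const n = toMinutes(now);
  return s < e ? (n >= s && n < e) : (n >= s || n < e);
}

console.log(inActiveWindow("9:00", "21:00", "12:30")); // daytime window, midday
console.log(inActiveWindow("23:00", "6:00", "2:15"));  // overnight window, early morning
```

Checking a candidate window this way against the hours of peak traffic is a quick sanity test before writing it to the balancer settings.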
If we schedule these maintenance windows regularly (e.g. once per week) and our collections are sharded on appropriate keys, it is less likely that the data will have become significantly unbalanced since the previous maintenance. In this case, the balancer should not take long to complete its work.
Running the balancer before the bulk-operations may improve the latter’s performance, by increasing the likelihood that writes are distributed across the shards. However, this is heavily dependent on the exact content of the bulk-operations. If, for instance, we perform a large number of inserts in a relatively small shard-key range, the write load will fall on a limited subset of shards. In addition, these shards will likely become unbalanced as a result, requiring the balancer to intervene.
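The effect of a narrow shard-key range can be illustrated with a small simulation. This is only a sketch: it assumes ranged (not hashed) sharding, and the chunk boundaries, the shard names, and the helpers `shardFor` and `writesPerShard` are all hypothetical.

```javascript
// Hypothetical chunk layout: each chunk covers [min, max) of the shard key.
const chunks = [
  { min: 0,   max: 250,  shard: "shardA" },
  { min: 250, max: 500,  shard: "shardB" },
  { min: 500, max: 750,  shard: "shardC" },
  { min: 750, max: 1000, shard: "shardD" },
];

// Find the shard that owns a given shard-key value.
function shardFor(key) {
  const chunk = chunks.find(c => key >= c.min && key < c.max);
  return chunk.shard;
}

// Count how many inserts each shard receives for a list of shard-key values.
function writesPerShard(keys) {
  const counts = {};
  for (const key of keys) {
    const shard = shardFor(key);
    counts[shard] = (counts[shard] || 0) + 1;
  }
  return counts;
}

// 1000 inserts spread evenly over the whole key space.
const spread = Array.from({ length: 1000 }, (_, i) => i);
// 1000 inserts confined to the narrow range [300, 320).
const narrow = Array.from({ length: 1000 }, (_, i) => 300 + (i % 20));

console.log(writesPerShard(spread)); // each of the four shards receives 250 inserts
console.log(writesPerShard(narrow)); // shardB receives all 1000 inserts
```

In the second case a single shard absorbs the whole write load, and its chunks grow while the others stay unchanged, which is the imbalance the balancer then has to correct.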
The point at which we take a backup depends on what method we are using to do so, when our most recent backup dates from, and whether we wish to take a backup of the data before the bulk-operations for safety.
If it is feasible for our use-case, a good option would be to use Cloud or Ops Manager’s Backup facility.
Because this provides effectively continuous backup of our cluster, it does not require a maintenance window, downtime, or manual intervention, and therefore removes Backup from the list of tasks we must consider here.
The ideal sequence of maintenance tasks might be as follows:
- Take a backup before performing any further maintenance operations
- Perform the bulk-operations on the cluster
- Run the balancer and allow it to redistribute the data
- Take a backup of the latest data
In short: take a backup, perform the bulk-operations on the data, allow the balancer to fix any resulting imbalances, and then take a backup of the latest data.
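The ordering matters because each step protects the next one. As a rough sketch, the sequence could be driven by a small runner that stops at the first failure, so that the bulk-operations never start without a successful backup before them. The function `runMaintenance`, the step names, and the empty `run` bodies are hypothetical placeholders, not MongoDB APIs. In practice each body would call the real backup tooling, the bulk-operation scripts, or shell helpers such as `sh.startBalancer()`.

```javascript
// Run the maintenance steps in order and stop at the first failure.
function runMaintenance(steps) {
  const completed = [];
  for (const step of steps) {
    try {
      step.run();
    } catch (err) {
      // Report which step failed and which steps finished before it.
      return { ok: false, completed, failedAt: step.name, error: err.message };
    }
    completed.push(step.name);
  }
  return { ok: true, completed };
}

// Placeholder steps: every `run` body here is hypothetical and does nothing.
const steps = [
  { name: "backup-before",   run: () => {} },
  { name: "bulk-operations", run: () => {} },
  { name: "balance",         run: () => {} }, // e.g. start the balancer and wait for it to finish
  { name: "backup-after",    run: () => {} },
];

const result = runMaintenance(steps);
console.log(result.ok, result.completed);
```

If the first backup throws, the runner returns immediately with `failedAt` set to `"backup-before"` and an empty `completed` list, leaving the data untouched.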