MongoDB Fragmentation best practices.

Q – Does data file fragmentation in MongoDB reduce the amount of data cached in memory?

Yes. The memory-mapping approach used by MongoDB maps data files directly into memory, so any fragmentation in those files is also mapped into memory and reduces the amount of useful data that can be cached.

Q – For a long-running database with out-of-place updates, when is the appropriate time, and what are the appropriate methods, to address fragmentation?

The best time to address fragmentation is during a scheduled maintenance window or, if that is not possible, at a time of low usage. I will cover the methods to address fragmentation separately below, but there is one additional consideration: if you choose the re-syncing-a-member approach, you should factor in network bandwidth, as this will be required when moving a large amount of data between the members involved.

Background causes for increased fragmentation

It is useful to understand the two application patterns that cause increased fragmentation: document growth and document removal.

  1. Document growth: documents in MongoDB are stored contiguously on disk for rapid access. When a document is inserted into MongoDB, disk storage is allocated for that document. If that document is updated at a later point, it can potentially grow beyond the size it was originally allocated. This causes the document to be moved completely to a new location, with the old location being placed on the free list. In the context of disk space management, document movement has the same end result as document removal, and frequent document movement can contribute to internal storage fragmentation. If you confirm with your application developers that there are frequently growing documents, the best strategy is to have them pre-pad the documents when they are created.
    http://docs.mongodb.org/manual/faq/developers/#faq-developers-manual-padding
  2. Document removal: when a document is removed from a MongoDB collection, the space it formerly used is added to a free list for later re-use. The algorithm used has known limitations which, combined with certain patterns of document deletion and allocation sizes, cause new disk space to be allocated instead of storage being reused from the free list. If these patterns continue for extended periods of time, they will contribute to internal storage fragmentation for that collection. The best strategy in the case of frequent document deletion is to set the usePowerOf2Sizes flag for that collection.
    http://docs.mongodb.org/manual/reference/command/collMod/#usePowerOf2Sizes

    • Note: where your application routinely performs bulk deletion of data, you can avoid fragmentation by redesigning your schema so that a bulk delete becomes a database drop. For example, store each day’s data in a separate collection in a separate database, and then drop that database when the day’s data is no longer required.
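Both mitigations above can be sketched in mongo shell JavaScript. This is a minimal sketch only: the collection name ‘events’, the field names, and the padding size are illustrative assumptions, not values from this note.

```javascript
// 1. Pre-pad a frequently-growing document at creation time: insert it with a
//    throwaway filler field, then unset that field so the allocated record
//    keeps room for later growth without the document having to move.
var filler = new Array(1024).join("x");   // roughly 1 KB of filler (1023 chars)
var doc = { _id: 1, status: "new", _padding: filler };
// db.events.insert(doc);
// db.events.update({ _id: 1 }, { $unset: { _padding: "" } });

// 2. Enable powers-of-two record allocation on a collection that sees frequent
//    deletes, via the collMod command.
var collModCmd = { collMod: "events", usePowerOf2Sizes: true };
// db.runCommand(collModCmd);
```

The commented-out lines are the calls you would run from the shell against a live database; the uncommented lines only build the documents so the sketch is self-contained.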

Procedures for detecting fragmentation

There are a number of high-level indicators that signal fragmentation including:

  1. A disk utilisation difference of greater than 20% between nodes in a replica set
  2. db.stats() output highlighting excessive disk utilisation (see http://docs.mongodb.org/manual/reference/method/db.stats/ for details on this method)

There are two levels of granularity to examine in these scenarios: node level and replica set level. To examine node-level fragmentation, use the following procedure:

  1. Run the db.stats() command, ignore the output of the ‘local’, ‘config’, and ‘admin’ databases when checking for fragmentation. These are not relevant for this diagnostic procedure.
  2. Examining the output of db.stats(), look at the ‘dataSize’ value in comparison to the expected amount of data in your database. A high value in this field indicates you are storing more data than expected (see http://docs.mongodb.org/manual/reference/command/dbStats/#dbStats.dataSize for more details on ‘dataSize’).
  3. Looking at the output again, examine the ‘indexSize’ value. If this value is equal to or greater than the value of ‘dataSize’, there is an issue with your indexing strategy; check with your application developers about their schema design and consider refactoring your schema (see http://docs.mongodb.org/manual/reference/command/dbStats/#dbStats.indexSize for more details on ‘indexSize’).
    • If the value of ‘indexSize’ is between 25% and 100% of the ‘dataSize’ value, this also indicates that a review of your indexing strategy with your application developers is appropriate. A schema redesign may or may not be warranted at this point; this is specific to the data access patterns and application design.
  4. Looking again at the db.stats() output, check the ‘fileSize’ value. Where this is less than 6 GB for any specific database, it is unlikely that defragmentation will help. Below 6 GB, the MongoDB ‘rule of thumb’ / guide calculations are not likely to be accurate due to the fixed overhead approach used.
  5. Examining the fields ‘storageSize’ and ‘dataSize’ from db.stats(), ‘storageSize’ will always be larger than ‘dataSize’ due to the internal padding used by MongoDB; however, when ‘storageSize’ is significantly larger than ‘dataSize’, there is internal fragmentation.
    • When you are using usePowerOf2Sizes on any collection in your database, this check does not apply. Note that from version 2.5.5 this is the standard allocation strategy for new collections.
    • When you pre-allocate records in any collection in your database to prevent document movement, this check also does not apply.
  6. Sum the fields ‘storageSize’ and ‘indexSize’ value from the db.stats() output and:
    • If the sum is less than ~75% of the value of ‘fileSize’, then you are likely to have internal fragmentation; or,
    • If the sum is greater than the value of ‘fileSize’ plus 7 GB, then you are likely to have internal fragmentation.
    • NOTE: These are approximate guidelines / rules of thumb, and are not to be taken as absolute / hard-and-fast rules
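The checks in steps 4 and 6 can be encoded as a small helper to paste into the mongo shell. This is a sketch only: the function name is our own, and the thresholds (the ~6 GB minimum, the ~75% ratio, and the 7 GB excess) are the approximate rules of thumb quoted above, not hard limits.

```javascript
// Apply the node-level rules of thumb to one db.stats() result.
function looksFragmented(stats) {
  var GB = 1024 * 1024 * 1024;
  if (stats.fileSize < 6 * GB) {
    return false;   // step 4: below ~6 GB the guideline calculations are unreliable
  }
  var used = stats.storageSize + stats.indexSize;
  // Step 6: either condition suggests internal fragmentation.
  return used < 0.75 * stats.fileSize || used > stats.fileSize + 7 * GB;
}

// In the shell you would call it per database: looksFragmented(db.stats())
```

Remember to skip the ‘local’, ‘config’, and ‘admin’ databases, as noted in step 1.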

To examine fragmentation at the replica set level for one replica set, use the following procedure:

  1. Collect the output of the db.stats() command for all the databases on all the nodes in the specific replica set. These commands should be run as close together in time as possible, to improve the quality of the metrics for your fragmentation assessment.
    • On a single node, the following code will extract the necessary statistics for all databases on that node:

        db._adminCommand("listDatabases").databases.forEach(function (d) {
            mdb = db.getSiblingDB(d.name);
            printjson(d.name);
            printjson(mdb.stats());
        })

    • The key field to examine in this output is the ‘objects’ field. If there is a significant variance between members of the replica set, stop and log a ticket in Jira with MongoDB Support. Otherwise, proceed with each node using the single-node procedure outlined above.
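As a rough aid for comparing the ‘objects’ counts gathered from each member, the spread can be computed with a small helper. The note does not define a numeric cutoff for "significant variance", so the 1% figure in the usage comment is an illustrative assumption, and the function name is our own.

```javascript
// Fractional spread of per-member 'objects' counts, relative to the largest.
function objectCountSpread(counts) {
  var min = Math.min.apply(null, counts);
  var max = Math.max.apply(null, counts);
  return (max - min) / max;
}

// Usage idea: if objectCountSpread([...counts...]) is well above something
// like 0.01 (1%), treat the variance as significant and raise a ticket.
```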

Fragmentation Analysis – Triage according to these guidelines

The three outcomes of the fragmentation analysis can be classified as follows, along with our recommended triage approach.

  1. Higher than expected data size – if the total data size is higher than expected, this is an issue that requires a separate diagnosis outside of internal fragmentation. If assistance is required, please contact MongoDB Support by raising a ticket in Jira.
  2. Higher than expected index size – if the total index size is higher than expected, this is an issue that requires a separate diagnosis outside of internal fragmentation. This situation is typically due to poor schema design, and resolving it is separate from the issue of fragmentation.
  3. Internal fragmentation – this is the only outcome when you should consider the procedures for addressing fragmentation.

Procedures for addressing fragmentation

There are four strategies to address fragmentation, each with advantages and disadvantages. They are:

  1. Resync a node from another replica set member – http://docs.mongodb.org/manual/tutorial/resync-replica-set-member/
  2. Run the compact() command on the collection – http://docs.mongodb.org/manual/reference/command/compact/
  3. Run the db.repairDatabase() command on the database – http://docs.mongodb.org/manual/reference/command/repairDatabase/#dbcmd.repairDatabase
  4. Run the --repair command on all the databases on the node – http://docs.mongodb.org/manual/reference/program/mongod/#cmdoption–repair

NOTE: These methods will make the node unavailable for normal use. They should be performed in a ‘rolling’ fashion, where only a single node is removed from service at a time.

NOTE: If you have deployed your replica sets with the recommended minimum of three data-bearing nodes, you can perform maintenance on one of them without having to take downtime for the entire replica set.

Each of these four strategies has advantages and disadvantages, as described below.

Resync a node from another replica set member

Advantages:

  • Does not require any additional disk space
  • Requires the least amount of administrative work
  • Returns the unused space back to the operating system
  • Removes fragmentation from all collections and databases

Disadvantages:

  • Can be time consuming if total data to be transferred is large and/or the network is slow
  • Places an additional burden/load on the other replica set members
  • Performs unnecessary work if only one collection is fragmented
Run the compact() command on the collection

Advantages:

  • Requires only 2 GB of additional disk space
  • Will only defragment a single collection
  • Allows you to set the paddingFactor and paddingBytes for that specific collection
  • Makes fragmented space available for re-use by all collections in that specific database

Disadvantages:

  • Does not return freed space (including the additional 2GB) to the operating system
  • The measures used for detecting fragmentation will not change immediately after running compact()
  • Only defragments a single collection
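From the shell, compact is issued as a command document against the database that holds the collection. A minimal sketch follows; the collection name ‘orders’ and the padding values are illustrative assumptions.

```javascript
// Command document for compacting one collection while setting its padding.
// paddingFactor is a multiplier on document size; paddingBytes (not shown)
// can be used instead for an absolute amount of padding per record.
var compactCmd = { compact: "orders", paddingFactor: 1.1 };

// From the shell, against the database containing the collection:
// db.runCommand(compactCmd);
```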
Run the db.repairDatabase() command on the database

Advantages:

  • Will return the unused space to the operating system
  • Defragments all collections in the specified database

Disadvantages:

  • Requires temporary disk space equal to the size of the database being repaired, plus an additional 2 GB
  • The temporary free space required by this operation must be available in the --dbpath directory; you cannot specify a different location for it
Run the --repair command on all the databases on the node

Advantages:

  • Returns unused space to the operating system
  • Defragments all collections in all databases
  • Can specify location of temporary storage using the --repairpath option

Disadvantages:

  • Requires temporary disk space equal to the size of the database being repaired, plus an additional 2 GB
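As a hedged sketch of this option, the invocation below repairs all databases on a node that has been taken out of service; the paths are illustrative assumptions, not values from your deployment.

```shell
# 1. Shut the mongod down cleanly, removing the node from the replica set's
#    normal rotation (rolling maintenance, one node at a time).
# 2. Restart it with --repair, pointing the temporary files at a separate
#    volume via --repairpath:
mongod --dbpath /data/db --repair --repairpath /mnt/scratch/repair
# 3. When the repair completes, restart the node normally and let it catch up
#    with the replica set before repeating on the next node.
```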

General strategies

The appropriate strategy will depend on the specific circumstances, however the general guidelines are as follows:

  1. Resyncing the node is often best and easiest if the resync time is reasonable
  2. Otherwise, use --repair with --repairpath if you want to return the maximum amount of space to the operating system
  3. Use compact() only if you have a single problematic collection and adequate free space in the file system
    Note that with each strategy, indexes will need to be rebuilt from scratch; large indexes (or large numbers of indexes) can therefore cause each of these solutions to take an extended time to complete.

Good practice – routine maintenance

It is recommended that you perform maintenance on a regular basis; the frequency depends on how quickly your databases fragment. For a typical installation, once per month is normally sufficient. This should be done, as noted above, in a rolling fashion, preferably during a maintenance window or otherwise at a time of low use.

Summary

In this note, I have addressed your broad questions around data file fragmentation and when it is appropriate to defragment. A short outline of how document removal and document growth can impact fragmentation is given to help you understand the root causes. The procedures for detecting fragmentation at the node level and at the replica set level are outlined. The note closes with an overview of the procedures to remove fragmentation, discussing their advantages and disadvantages, as well as recommendations for routine maintenance.

Next steps

It would be best to determine which of the removal strategies is most suitable for your own use cases (whether it is re-syncing, compaction or repair) and then to follow up on these specifically with a call.
