[MongoDB]: Database locking
High lock percentages or slow reads are often symptoms of other underlying issues.
The root causes for these issues include, but are not limited to:
- Schema design
- Document movement (I’ll focus on this topic in more detail below)
- Poor index strategies, including:
  - Over-indexing
  - Unindexed oversized arrays
  - Indexed arrays with too many documents
  - No indexes
  - Inappropriate compound indexes (this blog post from one of our engineers is an excellent resource on this topic)
- Insufficient RAM for the working set requirements
- Increases or changes in the incoming workload (e.g. a change may trigger the query planner to choose different, non-optimal indexes)
- Hardware / OS / configuration, including:
  - Inappropriate OS settings (e.g. not applying the production notes to a machine hosting a MongoDB service; I’ll discuss this below)
  - Co-locating multiple mongod instances on a physical host without appropriate resource management
  - Insufficient storage-layer provisioning (e.g. I/O overload at the disk level, disk failure/degradation)
  - Inappropriate VM configuration (e.g. insufficient memory reservations, not using available paravirtualized I/O drivers)
  - Inappropriate SAN strategy (e.g. not provisioning a dedicated storage group, or running multiple members of the same replica set on the same SAN)
I’ll provide an overview of two of these points, “Document Movement” and “Inappropriate OS settings”, in this ticket as examples. Each of the other topics (storage / VM / indexing / schema / workload) is quite detailed and should be investigated individually in separate tickets.
Document Movement
Regarding “best practices to suggest for anyone complaining about the database lock”: in the case of document movement there are two application patterns to avoid, relating to document growth and to document removal. These are good general recommendations; if there is a specific locking issue or complaint from a user, you should raise a separate ticket to focus on that case.
Document growth
Documents in MongoDB are stored contiguously on disk for rapid access. When a document is inserted, disk storage is allocated for it. If the document is updated later, it can grow beyond the size it was originally allocated. This forces the document to be moved to a new location, with the old location being placed on the free list. In the context of disk space management, document movement has the same end result as document removal, and frequent document movement can contribute to internal storage fragmentation. In 3.0, locking is further improved: the MMAPv1 storage engine provides collection-level locking and the WiredTiger storage engine provides document-level locking.
If you confirm with your application developers that there are frequently growing documents, the best strategy is to have them pre-pad the documents when they are created.
http://docs.mongodb.org/manual/faq/developers/#faq-developers-manual-padding
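As a rough illustration of the manual padding approach described in the FAQ linked above, the sketch below (Python) pads a document with a filler field at creation time so it already occupies roughly its eventual size; the `_padding` field name and 1 KB target are assumptions for this example, not MongoDB conventions, and `json` length is only a crude proxy for BSON size:

```python
import json

def make_padded_doc(doc, target_bytes=1024, pad_field="_padding"):
    """Return a copy of `doc` with a filler field so its serialized size
    is roughly `target_bytes`. On the first real update the application
    would $unset the filler, leaving room for the document to grow in
    place instead of being moved on disk."""
    padded = dict(doc)
    current = len(json.dumps(padded))  # crude proxy for the BSON size
    filler = max(0, target_bytes - current)
    padded[pad_field] = "x" * filler
    return padded

doc = make_padded_doc({"user": "alice", "events": []})
print(len(json.dumps(doc)) >= 1024)  # True: roughly the 1 KB target
```

The application inserts the padded document, then removes the filler on the first genuine update, so subsequent growth happens inside the originally allocated record.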
Document removal
When a document is removed from a MongoDB collection, the space it formerly used is added to a free list for later reuse. The allocation algorithm has known limitations: combined with certain patterns of document deletion and allocation sizes, it can cause new disk space to be allocated instead of space being reused from the free list. If these patterns continue for extended periods of time, they will contribute to internal storage fragmentation in that collection. The best strategy in the case of frequent document deletion is to set the usePowerOf2Sizes flag for that collection.
http://docs.mongodb.org/manual/reference/command/collMod/#usePowerOf2Sizes
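To illustrate why this flag helps, the sketch below (Python, purely a simplified model of the power-of-two allocation strategy, with an assumed 32-byte minimum) shows how record allocations get rounded up to a small set of standard sizes, so freed records are much more likely to be reused by later inserts:

```python
def power_of_two_allocation(record_bytes):
    """Round a record size up to the next power of two -- a simplified
    model of the usePowerOf2Sizes allocation strategy. With only a few
    standard record sizes in play, a slot freed by a deleted document
    is far more likely to fit a newly inserted one."""
    size = 32  # assumed minimum allocation, for illustration only
    while size < record_bytes:
        size *= 2
    return size

# A new 700-byte document can reuse the 1024-byte slot left behind by
# a deleted 900-byte document, instead of forcing a fresh allocation:
print(power_of_two_allocation(900))  # 1024
print(power_of_two_allocation(700))  # 1024
```

The flag itself is set with the collMod command described in the linked documentation, e.g. `db.runCommand({ collMod: "events", usePowerOf2Sizes: true })` in the mongo shell (the collection name "events" is a placeholder).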
- Note: where your application routinely performs bulk deletion of data, you can avoid fragmentation by redesigning your schema so that a bulk delete becomes a database drop. For example, store each day’s data in a separate collection in a separate database, then drop that database when the day’s data is no longer required.
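A minimal sketch of that per-day layout, assuming pymongo on the application side; the `logs_` prefix and `YYYYMMDD` date format are naming choices for this example, not anything prescribed by MongoDB:

```python
from datetime import date

def day_db_name(day, prefix="logs"):
    """Name of the database holding one day's data, e.g. "logs_20141103".
    Prefix and date format are assumptions for this sketch."""
    return f"{prefix}_{day.strftime('%Y%m%d')}"

# Writes for a given day go to that day's own database, e.g. with pymongo:
#   client[day_db_name(today)]["events"].insert_one(doc)
# Expiring the day later is then a single drop, which releases the files
# outright and leaves no free-list fragmentation behind:
#   client.drop_database(day_db_name(expired_day))
print(day_db_name(date(2014, 11, 3)))  # logs_20141103
```

The trade-off is that queries spanning multiple days must fan out across databases, so this suits append-mostly, time-bucketed data such as logs or metrics.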
Inappropriate OS settings
Regarding reads “taking too long” on a collection with a very high read/write volume: in this case, can you confirm that these machines are running with the appropriate OS/hardware settings from our production notes? After confirming this, the next step would be to check that the appropriate indexes exist.