MongoDB: count() inconsistency during the MongoDB Initial Sharding
We observed that inconsistent document counts of a collection when we query thru mongos during initial sharding due to newly added shard to the existing set-up.
It’s expected to have inconsistent counts when we query directly on shard, but App user should always see the consistent data even when there’s auto-balancing/initial sharding.
Also, please confirm this inconsistency only for row counts due to move chunks or also with data as well.
Answer :
count() in a sharded environment is understandably not accurate as data will typically be inflight between shards.
In some cases there may also be orphan documents from failed migrations.
We are working on making this more accurate for sharded clusters (see SERVER-3645 and SERVER-8405).
https://jira.mongodb.org/browse/SERVER-3645
You should always query through the mongos rather than directly to a specific shard as the mongos will filter out orphan document in queries returned to your application.
I hope this explains the behaviour of count() in a sharded environment, if you wish you can track and upvote these server tickets.
This will help highlight the feature to our engineering team as it is one of the measures used to determine feature selection for inclusion in our releases.