Messing with MongoDB Shard Tags?! Best Practices
Question:
What is the expected behavior in the following four cases?
We know this usage should be avoided, but we still want to know what happens if a user creates such a configuration by mistake, so that we can estimate the impact.
1) Creating overlapping tag ranges with different tag names.
2) Creating overlapping tag ranges with the same tag name.
3) Creating identical tag ranges with the same tag name.
4) Creating identical tag ranges with different tag names.
Answers:
1) Creating overlapping tag ranges with different tag names.
In cases where different tags have overlapping key ranges, it is possible for a single chunk to be “eligible” for membership in more than one Tag.
This does not appear to cause any exceptions in the database. However, the configuration is unsupported and the behavior is undefined. Actual behavior may differ from one release (or patchset) to another, and changes to the behavior may be made without warning or notification.
Here is what you can reasonably expect to happen, although MongoDB makes no representation about what will actually happen:
- When a new chunk is created with keys in an “overlapping” range (and therefore several Tags may be applicable) one tag will be arbitrarily selected, and the chunk will be assigned / migrated to a shard based on that tag.
- When rebalances occur, the “arbitrarily” selected tag may or may not remain the same.
- If a different tag is arbitrarily selected, the chunk may be migrated to a different shard.
- The possibility exists that a different tag will be selected on each rebalance; chunks whose keys fall into overlapping ranges may be constantly migrated from one shard to another.
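To estimate whether an existing configuration is exposed to this scenario, a small check like the one below may help. It is only a sketch under stated assumptions: `findConflictingRanges` is a hypothetical helper (not part of MongoDB), it assumes a single numeric shard-key field named `id` as in the examples in this post, and it treats each range as lower-bound inclusive and upper-bound exclusive.

```javascript
// Hypothetical helper (not a MongoDB API): report pairs of tag ranges in the
// same namespace that overlap while carrying different tags.
// Assumes a single numeric shard-key field called "id".
function findConflictingRanges(tagDocs) {
  const pairs = [];
  for (let i = 0; i < tagDocs.length; i++) {
    for (let j = i + 1; j < tagDocs.length; j++) {
      const a = tagDocs[i];
      const b = tagDocs[j];
      // Different namespaces never conflict; same-tag overlaps are scenario #2.
      if (a.ns !== b.ns || a.tag === b.tag) continue;
      // Ranges are [min, max): they overlap when each starts before the other ends.
      if (a.min.id < b.max.id && b.min.id < a.max.id) {
        pairs.push([a, b]);
      }
    }
  }
  return pairs;
}

// Documents shaped like the config.tags entries shown in this post.
const sampleTagDocs = [
  { ns: 'dbversitydb.mycol', min: { id: 1 }, max: { id: 150 }, tag: 'TAG2' },
  { ns: 'dbversitydb.mycol', min: { id: 150 }, max: { id: 200 }, tag: 'TAG2' },
  { ns: 'dbversitydb.mycol', min: { id: 0 }, max: { id: 300 }, tag: 'TAG1' },
];

const conflictingPairs = findConflictingRanges(sampleTagDocs);
console.log(conflictingPairs.length); // 2: TAG1 overlaps both TAG2 ranges
```

Against a real cluster, the input could presumably come from the config database, for example db.getSiblingDB('config').tags.find().toArray(); compound or non-numeric shard keys would need a different comparison.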
2) Creating overlapping tag ranges with the same tag name.
Again, overlapping key ranges are not permitted and not supported. The actual behavior is undocumented and may be unpredictable.
In this case, where all affected tags are the same, the effect will most likely be harmless. MongoDB will from time to time arbitrarily select one of several “eligible” tags, but since the tags are always the same, you should expect no behavior outside the ordinary.
3) Creating identical tag ranges with the same tag name.
Again, this is technically not permitted, and it is not supported.
Using your own example of tags with overlapping ranges, I have defined the same tag with the same range several times, with the following results:
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> use config;
mongos> db.tags.find();
{ "_id" : { "ns" : "dbversitydb.mycol", "min" : { "id" : 1 } }, "ns" : "dbversitydb.mycol", "min" : { "id" : 1 }, "max" : { "id" : 150 }, "tag" : "TAG2" }
{ "_id" : { "ns" : "dbversitydb.mycol", "min" : { "id" : 150 } }, "ns" : "dbversitydb.mycol", "min" : { "id" : 150 }, "max" : { "id" : 200 }, "tag" : "TAG2" }
{ "_id" : { "ns" : "dbversitydb.mycol", "min" : { "id" : 0 } }, "ns" : "dbversitydb.mycol", "min" : { "id" : 0 }, "max" : { "id" : 300 }, "tag" : "TAG1" }
Here, we see that “TAG1” is actually present only once in the config.tags collection. That collection uses “ns” (namespace) and the lower-bound of the key range as the _id, so it is not possible for multiple documents to exist with the same namespace and lower-bound.
This is consistent with what we observed earlier when experimenting with “overlapping” key ranges: when the overlapping ranges happen to have the same lower bound, the newly defined range/tag mapping replaces the prior one.
When defining the same tag multiple times with the same key range, the behavior is technically “undefined”. However, you can probably expect that the new tag definition will always replace the old one.
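The replacement behavior follows from how the _id is built, which can be illustrated with a toy model. This is a sketch only, not MongoDB's implementation: `tagStore` and `addTagRangeModel` are hypothetical names, and the upper bound of 500 in the last call is an illustrative value that does not come from the test above.

```javascript
// Toy in-memory stand-in for config.tags; not how MongoDB stores it internally.
const tagStore = new Map();

// Hypothetical model of adding a tag range: the key mirrors the
// _id shape { ns, min } visible in the db.tags.find() output.
function addTagRangeModel(ns, min, max, tag) {
  const key = JSON.stringify({ ns: ns, min: min });
  // Map.set overwrites any entry with the same key, like an upsert on _id.
  tagStore.set(key, { _id: { ns: ns, min: min }, ns: ns, min: min, max: max, tag: tag });
  return key;
}

// Define the same range and tag eight times, as in the shell session.
let lastKey = null;
for (let i = 0; i < 8; i++) {
  lastKey = addTagRangeModel('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1');
}
console.log(tagStore.size); // 1: only one entry per (ns, lower bound)

// Same lower bound, different (illustrative) upper bound: still one entry,
// and the newer definition wins.
addTagRangeModel('dbversitydb.mycol', { id: 0 }, { id: 500 }, 'TAG1');
console.log(tagStore.get(lastKey).max.id); // 500
```

The model only captures the uniqueness of (namespace, lower bound); it says nothing about how the balancer treats the resulting ranges.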
4) Creating identical tag ranges with different tag names.
This actually works out to the same thing as your scenario #3.
Again, when using the same collection and the same key range to define multiple tags, we are technically defining overlapping ranges, and this is technically not supported and the behavior is undefined.
When testing this (again, with your example), I can define the same key-range and tag several times and then introduce that same key-range with a different tag with these results:
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'TAG1')
mongos> sh.addTagRange('dbversitydb.mycol', { id: 0 }, { id: 300 }, 'ENTIRELY_NEW_TAG')
mongos> use config
switched to db config
mongos> db.tags.find();
{ "_id" : { "ns" : "dbversitydb.mycol", "min" : { "id" : 1 } }, "ns" : "dbversitydb.mycol", "min" : { "id" : 1 }, "max" : { "id" : 150 }, "tag" : "TAG2" }
{ "_id" : { "ns" : "dbversitydb.mycol", "min" : { "id" : 150 } }, "ns" : "dbversitydb.mycol", "min" : { "id" : 150 }, "max" : { "id" : 200 }, "tag" : "TAG2" }
{ "_id" : { "ns" : "dbversitydb.mycol", "min" : { "id" : 0 } }, "ns" : "dbversitydb.mycol", "min" : { "id" : 0 }, "max" : { "id" : 300 }, "tag" : "ENTIRELY_NEW_TAG" }
The end result is that TAG1 has disappeared entirely and been replaced with ENTIRELY_NEW_TAG.
This happens for exactly the same reasons described in scenario #3. While the behavior is undocumented and undefined, you can probably expect it to remain consistent for quite some time.
Summary
Of the 4 scenarios described, Scenario #1 is the only one where you are likely to encounter problems.
In no case should you define tags with overlapping key-ranges. In the special cases where your “overlapping” ranges share a common lower bound, you can probably expect that the new tag definition will simply replace the old one. (That may or may not be a desirable behavior, though.)
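One way to act on this advice is to check a new range against the existing ones before adding it. The following is a minimal sketch, assuming a single numeric shard-key field named `id` as in the examples; `checkNewRange` is a hypothetical helper, not a MongoDB function, and the 300 to 400 range in the second call is an illustrative value.

```javascript
// Hypothetical pre-check (not a MongoDB API): given existing tag documents and
// a candidate range, report which existing entry it would replace (same
// namespace and lower bound) and which other entries it would overlap.
function checkNewRange(existingDocs, candidate) {
  const result = { replaces: null, overlaps: [] };
  for (const doc of existingDocs) {
    if (doc.ns !== candidate.ns) continue;
    if (doc.min.id === candidate.min.id) {
      // Same (ns, lower bound): the new definition would take this entry's place.
      result.replaces = doc;
    } else if (doc.min.id < candidate.max.id && candidate.min.id < doc.max.id) {
      // Ranges are [min, max): a genuine overlap with a different lower bound.
      result.overlaps.push(doc);
    }
  }
  return result;
}

// Documents shaped like the config.tags entries shown in this post.
const existingTagDocs = [
  { ns: 'dbversitydb.mycol', min: { id: 1 }, max: { id: 150 }, tag: 'TAG2' },
  { ns: 'dbversitydb.mycol', min: { id: 150 }, max: { id: 200 }, tag: 'TAG2' },
  { ns: 'dbversitydb.mycol', min: { id: 0 }, max: { id: 300 }, tag: 'TAG1' },
];

// Scenario #4: same range, different tag.
const scenarioFour = checkNewRange(existingTagDocs, {
  ns: 'dbversitydb.mycol', min: { id: 0 }, max: { id: 300 }, tag: 'ENTIRELY_NEW_TAG',
});
console.log(scenarioFour.replaces.tag, scenarioFour.overlaps.length); // TAG1 2

// A range starting where TAG1 ends touches nothing (upper bounds are exclusive).
const disjoint = checkNewRange(existingTagDocs, {
  ns: 'dbversitydb.mycol', min: { id: 300 }, max: { id: 400 }, tag: 'TAG3',
});
console.log(disjoint.replaces, disjoint.overlaps.length); // null 0
```

A range would be safe to add only when both `replaces` is null and `overlaps` is empty; anything else falls into one of the four scenarios above.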