Wrong sharding on mongoDB 3.0

Posted on

Question :

I have a mongos with 2 shard servers (and 3 config servers)

sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("55ae2ddab9e8641e489ce39b")
}
  shards:
        {  "_id" : "shard0000",  "host" : "192.168.212.182:27018" }
        {  "_id" : "shard0001",  "host" : "192.168.212.106:27018" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "Vision_1282",  "partitioned" : true,  "primary" : "shard0001" }
                Vision_1282.Group
                        shard key: { "groepNr" : 1, "valId" : 1 }
                        chunks:
                                shard0001       1
                        { "groepNr" : { "$minKey" : 1 }, "valId" : { "$minKey" : 1 } } -->> { "groepNr" : { "$maxKey" : 1 }, "valId" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 0)
        {  "_id" : "db",  "partitioned" : false,  "primary" : "shard0000" }

I made the shard key as follows:

db.runCommand({ shardcollection : "Vision_1282.Group", key : {groepNr : 1, valId : 1}})

I insert about 10.000 records, where the groepNr is 1, 2, 3 or 4 and a valId can be between 1 and 4001.

When I look at the shardDistribution I see that shard001 only has records where groepNr = 1 and valId = 1. All other records are in shard002.

I thought that the value behind the key value (here groepNr and valId) is 1 for ascending and -1 for descending.

PS. in PHP code, where I do the inserts I also ensure to use the index.
$collection->ensureIndex(array("groepNr" => 1 , "valId" => 1));

Did I do something wrong with creating the shardkey?

Answer :

I have lowered the chunksize to 1 MB:

 { "_id" : "chunksize", "value" : 1 }

I inserted now about 10.000 documents.

If I look at the sh.status(), I see this:

Shard shard0000 at 192.168.212.182:27018
 data : 1.33Mb docs : 5761 chunks : 4
 estimated data per chunk : 343kb
 estimated docs per chunk : 1440

Shard shard0001 at 192.168.212.106:27018
 data : 1017kb docs : 4249 chunks : 3
 estimated data per chunk : 339kb
 estimated docs per chunk : 1416

Totals
 data : 2.33Mb docs : 10010 chunks : 7
 Shard shard0000 contains 57.41% data, 57.55% docs in cluster, avg obj size on shard : 243b
 Shard shard0001 contains 42.58% data, 42.44% docs in cluster, avg obj size on shard : 245b

Also I run the scripts AllChunkInfo(“Vision_1282.Group”), with this result:

mongos> AllChunkInfo("Vision_1282.Group")
ChunkID,Shard,ChunkSize,ObjectsInChunk
Vision_1282.Group-groepNr_MinKeyvalId_MinKey,shard0001,83504,301
Vision_1282.Group-groepNr_"1"valId_"2000",shard0001,1920,8
Vision_1282.Group-groepNr_"1"valId_"2002",shard0001,974144,4012
Vision_1282.Group-groepNr_"2"valId_"1",shard0000,524384,2138
Vision_1282.Group-groepNr_"3"valId_"4",shard0000,498704,2031
Vision_1282.Group-groepNr_"4"valId_"4",shard0000,292800,1220
Vision_1282.Group-groepNr_"4"valId_"6",shard0000,113520,473
***********Summary Chunk Information***********
Total Chunks: 7
Average Chunk Size (bytes): 355568
Empty Chunks: 0
Average Chunk Size (non-empty): 355568

So you are right, the records were to small and the chunksize was to big.

Tx.

No need to do the ensureIndex in php. When sharding your collection, the index gets created automatically. Mostly I try to create the index before the shardCollection, just a habit.

First make sure you don’t have Vision_1282 databases on your shards themselves:

shardA:PRIMARY> show dbs
local    6.075GB
shardA:PRIMARY> exit
bye

shardB:PRIMARY> show dbs
local    6.075GB
shardB:PRIMARY> exit
bye

Then I did the following steps:

(using a great script from Adam Comerford to check size/count/place of chunks : https://github.com/comerford/mongodb-scripts)

$ mongo --shell allChunkInfo.js
MongoDB shell version: 3.0.3
connecting to: test
type "help" for help
mongos> show dbs
admin    (empty)
config   0.016GB
mongos> sh.enableSharding("Vision_1282")
{ "ok" : 1 }
mongos> use Vision_1282
switched to db Vision_1282
mongos> db.Group.createIndex({groepNr : 1, valId : 1})
{
    "raw" : {
        "shardB/mongo2-2-1:27018,mongo2-2-2:27018" : {
            "createdCollectionAutomatically" : true,
            "numIndexesBefore" : 1,
            "numIndexesAfter" : 2,
            "ok" : 1,
            "$gleStats" : {
                "lastOpTime" : Timestamp(1437540334, 5),
                "electionId" : ObjectId("55a72087bc8dee49cc03fce5")
            }
        }
    },
    "ok" : 1
}
mongos> use admin
switched to db admin
mongos> db.runCommand({ shardcollection : "Vision_1282.Group", key : {groepNr : 1, valId : 1}})
{ "collectionsharded" : "Vision_1282.Group", "ok" : 1 }
mongos> use Vision_1282
switched to db Vision_1282
mongos> db.Group.insert({groepNr:1, valId:3})
WriteResult({ "nInserted" : 1 })
mongos> db.Group.find()
{ "_id" : ObjectId("55af205a0eb443db0213e359"), "groepNr" : 1, "valId" : 3 }
mongos> AllChunkInfo("Vision_1282.Group")
ChunkID,Shard,ChunkSize,ObjectsInChunk
Vision_1282.Group-groepNr_MinKeyvalId_MinKey,shardB,112,1
***********Summary Chunk Information***********
Total Chunks: 1
Average Chunk Size (bytes): 112
Empty Chunks: 0
Average Chunk Size (non-empty): 112
mongos> db.Group.remove({})
WriteResult({ "nRemoved" : 1 })
mongos> db.Group.count()
0

Using a python script to import 10.000 records, then the result was indeed 9999 records on 1 shard and 1 document in the other.

mongos> AllChunkInfo("Vision_1282.Group")
ChunkID,Shard,ChunkSize,ObjectsInChunk
Vision_1282.Group-groepNr_MinKeyvalId_MinKey,shardA,48,1
Vision_1282.Group-groepNr_1valId_2,shardB,360128,7502
Vision_1282.Group-groepNr_4valId_5,shardB,119872,2497
***********Summary Chunk Information***********
Total Chunks: 3
Average Chunk Size (bytes): 160016
Empty Chunks: 0
Average Chunk Size (non-empty): 160016

But… after inserting bigger records with more data… and adding a lot more records … and patience … finally you can see the sharding in action:

mongos> AllChunkInfo("Vision_1282.Group")
ChunkID,Shard,ChunkSize,ObjectsInChunk
Vision_1282.Group-groepNr_MinKeyvalId_MinKey,shardA,576,8
Vision_1282.Group-groepNr_1valId_2,shardA,10490992,106806
Vision_1282.Group-groepNr_2valId_4455,shardB,11521744,110015
Vision_1282.Group-groepNr_3valId_14462,shardB,4757472,42486
Vision_1282.Group-groepNr_4valId_5,shardB,8998288,87476
***********Summary Chunk Information***********
Total Chunks: 5
Average Chunk Size (bytes): 7153814.4
Empty Chunks: 0
Average Chunk Size (non-empty): 7153814.4

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("55a754a9ac01a84a90b137cf")
}
  shards:
    {  "_id" : "shardA",  "host" : "shardA/mongo1:27018,mongo2:27018" }
    {  "_id" : "shardB",  "host" : "shardB/mongo3:27018,mongo4:27018" }
  balancer:
    Currently enabled:  yes
    Currently running:  yes
        Balancer lock taken at Wed Jul 22 2015 05:15:40 GMT+0000 (UTC) by ac2-1:27017:1437029544:1804289383:Balancer:1681692777
    Failed balancer rounds in last 5 attempts:  0
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }

    {  "_id" : "Vision_1282",  "partitioned" : true,  "primary" : "shardB" }
        Vision_1282.Group
            shard key: { "groepNr" : 1, "valId" : 1 }
            chunks:
                shardA  2
                shardB  3
            { "groepNr" : { "$minKey" : 1 }, "valId" : { "$minKey" : 1 } } -->> { "groepNr" : 1, "valId" : 2 } on : shardA Timestamp(2, 0) 
            { "groepNr" : 1, "valId" : 2 } -->> { "groepNr" : 2, "valId" : 4455 } on : shardA Timestamp(3, 0) 
            { "groepNr" : 2, "valId" : 4455 } -->> { "groepNr" : 3, "valId" : 14462 } on : shardB Timestamp(3, 1) 
            { "groepNr" : 3, "valId" : 14462 } -->> { "groepNr" : 4, "valId" : 5 } on : shardB Timestamp(2, 4) 
            { "groepNr" : 4, "valId" : 5 } -->> { "groepNr" : { "$maxKey" : 1 }, "valId" : { "$maxKey" : 1 } } on : shardB Timestamp(1, 3) 

Conclusion: because the records were too small, and the chunk was not big enough to split yet.

To see the configuration of the size of the chunks:

mongos> use config
switched to db config
mongos> db.settings.find()
{ "_id" : "chunksize", "value" : 64 }

Leave a Reply

Your email address will not be published. Required fields are marked *