Question :
The following count
query is taking lot more time than expected though index is being used and keysExamined
, docsExamined
are also less
query = {'indxed_field': 'x',
'a': 'test',
'b': {'$ne': True},
'c': {'$gte': 100},
'd': 2}
the query time is varying from 10-30 secs for different values mentioned there.Following is the explain output of same query with find
(to get explain)
{u'$clusterTime': {u'clusterTime': Timestamp(1554365186, 14),
u'signature': {u'hash': Binary('xxxxxxxxxxxxxx', 0),
u'keyId': 0L}},
u'executionStats': {u'allPlansExecution': [],
u'executionStages': {u'advanced': 16917,
u'alreadyHasObj': 0,
u'docsExamined': 26218,
u'executionTimeMillisEstimate': 11263,
u'filter': {u'$and': [{u'a': {u'$eq': u'test'}},
{u'b': {u'$eq': 2}},
{u'c': {u'$gte': 100}},
{u'$nor': [{u'd': {u'$eq': True}}]}]},
u'inputStage': {u'advanced': 26218,
u'direction': u'forward',
u'dupsDropped': 0,
u'dupsTested': 0,
u'executionTimeMillisEstimate': 40,
u'indexBounds': {u'indxed_field': [u'["x", "y"]']},
u'indexName': u'indxed_field_1',
u'indexVersion': 1,
u'invalidates': 0,
u'isEOF': 1,
u'isMultiKey': False,
u'isPartial': False,
u'isSparse': False,
u'isUnique': False,
u'keyPattern': {u'indxed_field': 1},
u'keysExamined': 26218,
u'multiKeyPaths': {u'indxed_field': []},
u'nReturned': 26218,
u'needTime': 0,
u'needYield': 0,
u'restoreState': 567,
u'saveState': 567,
u'seeks': 1,
u'seenInvalidated': 0,
u'stage': u'IXSCAN',
u'works': 26219},
u'invalidates': 0,
u'isEOF': 1,
u'nReturned': 16917,
u'needTime': 9301,
u'needYield': 0,
u'restoreState': 567,
u'saveState': 567,
u'stage': u'FETCH',
u'works': 26219},
u'executionSuccess': True,
u'executionTimeMillis': 11271,
u'nReturned': 16917,
u'totalDocsExamined': 26218,
u'totalKeysExamined': 26218},
u'ok': 1.0,
u'operationTime': Timestamp(1554365186, 14),
u'queryPlanner': {u'indexFilterSet': False,
u'namespace': u'xxxxxxxxxxxx',
u'parsedQuery': {u'$and': [{u'indxed_field': {u'$eq': u'x'}},
{u'a': {u'$eq': u'test'}},
{u'b': {u'$eq': 2}},
{u'c': {u'$gte': 100}},
{u'$nor': [{u'd': {u'$eq': True}}]}]},
u'plannerVersion': 1,
u'rejectedPlans': [],
u'winningPlan': {u'filter': {u'$and': [{u'a': {u'$eq': u'test'}},
{u'b': {u'$eq': 2}},
{u'c': {u'$gte': 100}},
{u'$nor': [{u'd': {u'$eq': True}}]}]},
u'inputStage': {u'direction': u'forward',
u'indexBounds': {u'indxed_field': [u'["x", "y"]']},
u'indexName': u'indxed_field_1',
u'indexVersion': 1,
u'isMultiKey': False,
u'isPartial':
False,
u'isSparse': False,
u'isUnique': False,
u'keyPattern': {u'indxed_field': 1},
u'multiKeyPaths': {u'indxed_field': []},
u'stage': u'IXSCAN'},
u'stage': u'FETCH'}},
u'serverInfo': {u'gitVersion': u'xxxxxxxxxxxxxxxxxxxxxxxx',
u'host': u'xxxxxxxxxxxxxxxxxxxxx',
u'port': 27017,
u'version': u'x.x.x'}}
And I did a count query in one shell and collected the db.currentOp() output in another shell, that is mentioned below
{
"inprog" : [
{
"host" : "xxxxxxxxxx",
"desc" : "conn69767",
"connectionId" : 69767,
"client" : "127.0.0.1:34876",
"appName" : "MongoDB Shell",
"clientMetadata" : {
"application" : {
"name" : "MongoDB Shell"
},
"driver" : {
"name" : "MongoDB Internal Client",
"version" : "4.0.2"
},
"os" : {
"type" : "Linux",
"name" : "Ubuntu",
"architecture" : "x86_64",
"version" : "16.04"
}
},
"active" : true,
"currentOpTime" : "2019-04-05T11:56:58.102+0000",
"opid" : 13776192,
"lsid" : {
"id" : UUID("xxxxxxxxxxxx"),
"uid" : BinData(0,"xxxxxxxxx")
},
"secs_running" : NumberLong(5),
"microsecs_running" : NumberLong(5141567),
"op" : "command",
"ns" : "my_collection",
"command" : {
"count" : "my_collection",
"query" : {
"indxed_field" : "x",
"a" : "test",
"b" : {
"$ne" : true
},
"c" : {
"$gte" : 190
},
"d" : 2,
"is_deleted" : {
"$ne" : true
}
},
"fields" : {
},
"lsid" : {
"id" : UUID("xxxxxxxxxxx")
},
"$clusterTime" : {
"clusterTime" : Timestamp(1554465406, 5),
"signature" : {
"hash" : BinData(0,"xxxxxxxxx"),
"keyId" : NumberLong(0)
}
},
"$readPreference" : {
"mode" : "secondaryPreferred"
},
"$db" : "mydb"
},
"planSummary" : "IXSCAN { indxed_field: 1 }",
"numYields" : 260,
"locks" : {
"Global" : "r",
"Database" : "r",
"Collection" : "r"
},
"waitingForLock" : false,
"lockStats" : {
"Global" : {
"acquireCount" : {
"r" : NumberLong(261)
}
},
"Database" : {
"acquireCount" : {
"r" : NumberLong(261)
}
},
"Collection" : {
"acquireCount" : {
"r" : NumberLong(261)
}
}
}
}
],
"ok" : 1,
"operationTime" : Timestamp(1554465418, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1554465418, 1),
"signature" : {
"hash" : BinData(0,"xxxxxxxx"),
"keyId" : NumberLong(0)
}
}
}
I was bit unclear at lockStats
section. Any Ideas why the query is taking more time.
Answer :
Mongo count query taking long time though index is being used
As per MongoDB documentation here To return information on query plans and execution statistics of the query plans, MongoDB provides:
- the db.collection.explain()
method, - the
cursor.explain() method, and - the
explain command.
The explain results present the query plans as a tree of stages.
"winningPlan" : {
"stage" : <STAGE1>,
...
"inputStage" : {
"stage" : <STAGE2>,
...
"inputStage" : {
"stage" : <STAGE3>,
...
}
}
},
Each stage passes its results (i.e. documents
or index keys
) to the parent node. The leaf nodes access the collection or the indices. The internal nodes manipulate the documents or the index keys that result from the child nodes. The root node is the final stage from which MongoDB derives the result set.
Stages are descriptive of the operation; e.g.
- COLLSCAN for a collection scan
- IXSCAN for scanning
index keys
- FETCH for retrieving documents
- SHARD_MERGE for merging results from shards
- SHARDING_FILTER for filtering out orphan documents from shards
explain.executionStats.executionStages.inputStage.keysExamined
For query execution stages that scan an index (e.g. IXSCAN), keysExamined is the total number of in-bounds and out-of-bounds keys that are examined in the process of the index scan. If the index scan consists of a single contiguous range of keys, only in-bounds keys need to be examined. If the index bounds consists of several key ranges, the index scan execution process may examine out-of-bounds keys in order to skip from the end of one range to the beginning of the next.
Consider the following example, where there is an index
of field x
and the collection contains 100
documents with x values 1 through 100
:
db.keys.find( { x : { $in : [ 3, 4, 50, 74, 75, 90 ] } } ).explain( "executionStats" )
The query will scan keys 3 and 4. It will then scan the key 5, detect that it is out-of-bounds, and skip to the next key 50.
Continuing this process, the query scans keys 3
, 4
, 5
, 50
, 51
, 74
, 75
, 76
, 90
, and 91
. Keys 5
, 51
, 76
, and 91
are out-of-bounds keys that are still examined. The value of keysExamined
is 10
.
Note: New in version 3.4: For index scan (IXSCAN
) stages only. The number of times that we had to seek the index cursor to a new position in order to complete the index scan.
executionStats
The returned executionStats information details the execution of the winning plan. In order to include executionStats in the results, you must run the explain in either:
- executionStats
or - allPlansExecution
verbosity mode. Use allPlansExecution mode to include partial
execution data captured during plan
selection.
For unsharded collections, explain returns the following executionStats
information:
"executionStats" : {
"executionSuccess" : <boolean>,
"nReturned" : <int>,
"executionTimeMillis" : <int>,
"totalKeysExamined" : <int>,
"totalDocsExamined" : <int>,
"executionStages" : {
"stage" : <STAGE1>
"nReturned" : <int>,
"executionTimeMillisEstimate" : <int>,
"works" : <int>,
"advanced" : <int>,
"needTime" : <int>,
"needYield" : <int>,
"saveState" : <int>,
"restoreState" : <int>,
"isEOF" : <boolean>,
...
"inputStage" : {
"stage" : <STAGE2>,
"nReturned" : <int>,
"executionTimeMillisEstimate" : <int>,
...
"inputStage" : {
...
}
}
},
"allPlansExecution" : [
{
"nReturned" : <int>,
"executionTimeMillisEstimate" : <int>,
"totalKeysExamined" : <int>,
"totalDocsExamined" :<int>,
"executionStages" : {
"stage" : <STAGEA>,
"nReturned" : <int>,
"executionTimeMillisEstimate" : <int>,
...
"inputStage" : {
"stage" : <STAGEB>,
...
"inputStage" : {
...
}
}
}
},
...
]
}
When an index covers a query, MongoDB can both match the query conditions and return the results using only the index keys; i.e. MongoDB does not need to examine documents from the collection to return the results.
When an index covers a query, the explain result has an IXSCAN stage that is not a descendant of a FETCH stage, and in the executionStats, the totalDocsExamined
is 0.
'nReturned': 16917
'totalDocsExamined': 26218
'totalKeysExamined': 26218
26218
records are filtered using the index key, but the final result returns only 16917
. In this case, add few more fields to your index as shown below
{ indxed_field_1 : 1, 'a' : 1, 'b' : 1, 'd': 1, c: 1 } --> This will really improve your performance.
Also, instead of using 'b': {'$ne': True}
, use 'b': Fasle
. $ne
is always is resource consuming operation.
So your final query will look like below.
query = {'indxed_field': 'x',
'a': 'test',
'b': False,
'c': {'$gte': 100},
'd': 2}
Additional Notes:
If you feel you don’t want to include all the fields in the index, the below combination also may improve performance.
{ indxed_field_1 : 1, c: 1 }