MongoDB Query Optimization: From 10 s to 10 ms

xiaoxiao2026-04-09 31

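This article comes from my former colleague Fu Qiulei (付秋雷), who recently ran into a very interesting problem with how MongoDB chooses an execution plan. After digging through the source code, he worked out the whole problem and wrote it up to share. Fu Qiulei (see his blog) was a core developer of Tair (a KV storage system used very widely inside Alibaba) and now works at Mogujie (蘑菇街).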

Background

Mr. Su reported that a query in production was very slow (10+ seconds). The statement was equivalent to:

db.myColl.find({app:"my_app",requestTime:{$gte:1492502247000,$lt:1492588800000}}).sort({_id:-1}).limit(1)

Documents in the myColl collection look like this:

{ "_id" : ObjectId("58fd895359cb8757d493ce60"), "app" : "my_app", "eventId" : 141761066, "requestTime" : NumberLong("1493010771753"), "scene" : "scene01" }
{ "_id" : ObjectId("58fd895359cb8757d493ce52"), "app" : "my_app", "eventId" : 141761052, "requestTime" : NumberLong("1493010771528"), "scene" : "scene02" }
{ "_id" : ObjectId("58fd895359cb8757d493ce36"), "app" : "my_app", "eventId" : 141761024, "requestTime" : NumberLong("1493010771348"), "scene" : "scene03" }
{ "_id" : ObjectId("58fd895359cb8757d493ce31"), "app" : "my_app", "eventId" : 141761019, "requestTime" : NumberLong("1493010771303"), "scene" : "scene01" }
{ "_id" : ObjectId("58fd895359cb8757d493ce2d"), "app" : "my_app", "eventId" : 141761015, "requestTime" : NumberLong("1493010771257"), "scene" : "scene01" }
{ "_id" : ObjectId("58fd895259cb8757d493ce10"), "app" : "my_app", "eventId" : 141760986, "requestTime" : NumberLong("1493010770866"), "scene" : "scene01" }
{ "_id" : ObjectId("58fd895259cb8757d493ce09"), "app" : "my_app", "eventId" : 141760979, "requestTime" : NumberLong("1493010770757"), "scene" : "scene01" }
{ "_id" : ObjectId("58fd895259cb8757d493ce02"), "app" : "my_app", "eventId" : 141760972, "requestTime" : NumberLong("1493010770614"), "scene" : "scene03" }
{ "_id" : ObjectId("58fd895259cb8757d493cdf1"), "app" : "my_app", "eventId" : 141760957, "requestTime" : NumberLong("1493010770342"), "scene" : "scene02" }
{ "_id" : ObjectId("58fd895259cb8757d493cde6"), "app" : "my_app", "eventId" : 141760946, "requestTime" : NumberLong("1493010770258"), "scene" : "scene01" }

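Question A

Which index does each of the two queries use, such that their execution times differ so much? Here the "bad query" is the slow statement above, and the "good query" is the same statement without the $lt upper bound on requestTime.

Like MySQL, MongoDB provides explain, which returns the query plan (queryPlanner) and the statistics collected during execution (executionStats).

A digression: Cassandra has a similar feature; I have not seen one in HBase so far.

In the mongo shell, append .explain('executionStats') to the query. For the good query, the corresponding explain statement is: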

db.myColl.find({app:"my_app",requestTime:{$gte:1492502247000}}).sort({_id:-1}).limit(1).explain('executionStats')

The explain output for the good query is shown below, with irrelevant details elided as ...:

{
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "myDatabase.myColl",
    "indexFilterSet" : false,
    "parsedQuery" : ...
    "winningPlan" : {
      "stage" : "LIMIT",
      "limitAmount" : 1,
      "inputStage" : {
        "stage" : "FETCH",
        "filter" : ...,
        "inputStage" : {
          "stage" : "IXSCAN",
          "keyPattern" : { "_id" : 1 },
          "indexName" : "_id_",
          ...
          "direction" : "backward",
          "indexBounds" : { "_id" : [ "[MaxKey, MinKey]" ] }
        }
      }
    },
    "rejectedPlans" : ...,
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 1,
    "executionTimeMillis" : 0,
    "totalKeysExamined" : 8,
    "totalDocsExamined" : 8,
    "executionStages" : {
      "stage" : "LIMIT",
      ...
      "inputStage" : {
        "stage" : "FETCH",
        ...
        "inputStage" : {
          "stage" : "IXSCAN",
          ...
          "direction" : "backward",
          "indexBounds" : { "_id" : [ "[MaxKey, MinKey]" ] },
          "keysExamined" : 8,
          ...
        }
      }
    }
  },
  "serverInfo" : ...,
  "ok" : 1
}

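The result has four parts: queryPlanner, executionStats, serverInfo, and ok. Only queryPlanner and executionStats matter here.

executionStats holds the statistics collected while executing the plan in queryPlanner.winningPlan. From indexBounds (together with indexName and keyPattern) we can see that in the index scan (IXSCAN) stage the good query uses the _id primary-key index. The keysExamined count of the IXSCAN stage explains why the good query is so fast: it examined only 8 index keys.

Use explain the same way to see which index the bad query uses: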

{
  "queryPlanner" : {
    ...
    "winningPlan" : {
      "stage" : "SORT",
      ...
      "inputStage" : {
        "stage" : "SORT_KEY_GENERATOR",
        "inputStage" : {
          "stage" : "FETCH",
          "inputStage" : {
            "stage" : "IXSCAN",
            "keyPattern" : { "app" : 1, "scene" : 1, "eventId" : -1, "requestTime" : -1 },
            "indexName" : "idx_app_1_scene_1_eventId_-1_requestTime_-1",
            ...
            "direction" : "forward",
            "indexBounds" : {
              "app" : [ "[\"my_app\", \"my_app\"]" ],
              "scene" : [ "[MinKey, MaxKey]" ],
              "eventId" : [ "[MaxKey, MinKey]" ],
              "requestTime" : [ "(1492588800000.0, 1492502247000.0]" ]
            }
          }
        }
      }
    },
    "rejectedPlans" : ...,
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 1,
    "executionTimeMillis" : 56414,
    "totalKeysExamined" : 3124535,
    "totalDocsExamined" : 275157,
    "executionStages" : {
      "stage" : "SORT",
      ...
      "inputStage" : {
        "stage" : "SORT_KEY_GENERATOR",
        ...
        "inputStage" : {
          "stage" : "FETCH",
          ...
          "inputStage" : {
            "stage" : "IXSCAN",
            ...
            "direction" : "forward",
            "indexBounds" : {
              "app" : [ "[\"my_app\", \"my_app\"]" ],
              "scene" : [ "[MinKey, MaxKey]" ],
              "eventId" : [ "[MaxKey, MinKey]" ],
              "requestTime" : [ "(1492588800000.0, 1492502247000.0]" ]
            },
            "keysExamined" : 3124535,
            ...
          }
        }
      }
    }
  },
  "serverInfo" : ...,
  "ok" : 1
}

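The bad query uses a compound index (Compound Indexes), which is indeed a different index from the one the good query uses. The keysExamined count of the IXSCAN stage shows that it examined 3124535 index keys, which is why execution takes so long.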
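Reading these nested explain outputs by eye gets tedious. As a minimal sketch in plain JavaScript (the helper name findStage and the trimmed sample object are illustrative, not part of MongoDB), the following walks the inputStage chain of executionStats.executionStages and pulls out the IXSCAN numbers discussed above. It assumes a single chain of inputStage fields, as in the outputs shown here.

```javascript
// Hypothetical helper: walk the inputStage chain of an explain() result
// and return the first stage with the given name, or null if none matches.
function findStage(stage, name) {
  while (stage) {
    if (stage.stage === name) return stage;
    stage = stage.inputStage;
  }
  return null;
}

// Trimmed copy of the bad query's executionStats shown above.
const badStats = {
  nReturned: 1,
  executionStages: {
    stage: "SORT",
    inputStage: {
      stage: "SORT_KEY_GENERATOR",
      inputStage: {
        stage: "FETCH",
        inputStage: { stage: "IXSCAN", keysExamined: 3124535 }
      }
    }
  }
};

const ixscan = findStage(badStats.executionStages, "IXSCAN");
// Index keys examined per returned document: a rough measure of wasted work.
const keysPerResult = ixscan.keysExamined / badStats.nReturned;
console.log(ixscan.stage, ixscan.keysExamined, keysPerResult);
```

For the good query the same ratio would be 8 keys for 1 returned document, against 3124535 here.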

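Question B

If the two queries pick different indexes, why do they differ? The two queries look quite similar.

How does MongoDB pick the index it considers appropriate for a query?

Roughly speaking, it first selects several candidate query plans and then scores them according to a set of rules. The highest-scoring plan is the one considered appropriate, and the index it uses is the index considered appropriate.

That was the rough version; now in more detail. (As the saying goes, an explanation without code is just hand-waving. All code below is based on mongodb-3.2.10.)

First, a stack trace: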

mongo::PlanRanker::scoreTree
mongo::PlanRanker::pickBestPlan
mongo::MultiPlanStage::pickBestPlan
mongo::PlanExecutor::pickBestPlan
mongo::PlanExecutor::make
mongo::PlanExecutor::make
mongo::getExecutor
mongo::getExecutorFind
mongo::FindCmd::explain

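This is the stack printed while debugging mongod with lldb, with a breakpoint set at mongo::PlanRanker::scoreTree (the code is in src/mongo/db/query/plan_ranker.cpp).

scoreTree is where the score of each query plan is computed: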

// We start all scores at 1. Our "no plan selected" score is 0 and we want all plans to
// be greater than that.
double baseScore = 1;

// How many "units of work" did the plan perform. Each call to work(...)
// counts as one unit.
size_t workUnits = stats->common.works;

// How much did a plan produce?
// Range: [0, 1]
double productivity =
    static_cast<double>(stats->common.advanced) / static_cast<double>(workUnits);

...

double tieBreakers = noFetchBonus + noSortBonus + noIxisectBonus;
double score = baseScore + productivity + tieBreakers;
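To make the formula concrete, here is a minimal sketch in plain JavaScript rather than the MongoDB C++ code. The function name scorePlan and the works/advanced numbers are hypothetical, and the tie-breaker bonuses, whose computation the excerpt elides, are simply passed in as a parameter instead of being computed.

```javascript
// Sketch of the scoring formula from the excerpt above.
// tieBreakers is an input here because the excerpt elides how the
// individual bonuses are computed.
function scorePlan(stats, tieBreakers) {
  const baseScore = 1;
  const productivity = stats.advanced / stats.works; // range [0, 1]
  return baseScore + productivity + tieBreakers;
}

// Hypothetical trial statistics for two candidate plans.
const planA = { works: 100, advanced: 50 }; // half of the work units produced a result
const planB = { works: 100, advanced: 1 };  // almost all work units produced nothing

const scoreA = scorePlan(planA, 0);
const scoreB = scorePlan(planB, 0);
console.log(scoreA > scoreB ? "plan A wins" : "plan B wins");
```

With equal tie-breakers, the plan that returns more results per unit of work gets the higher score.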

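scoreTree does not execute the query itself; it only computes a score from an existing PlanStageStats* stats. So when is the query executed to obtain each plan's PlanStageStats* stats?

In mongo::MultiPlanStage::pickBestPlan (the code is in src/mongo/db/exec/multi_plan.cpp), workAllPlans is called to execute all candidate plans, at most numWorks times: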

size_t numWorks = getTrialPeriodWorks(getOpCtx(), _collection);
size_t numResults = getTrialPeriodNumToReturn(*_query);

// Work the plans, stopping when a plan hits EOF or returns some
// fixed number of results.
for (size_t ix = 0; ix < numWorks; ++ix) {
    bool moreToDo = workAllPlans(numResults, yieldPolicy);
    if (!moreToDo) {
        break;
    }
}
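The trial loop can be pictured with a small simulation. This is a simplified sketch in plain JavaScript under loudly stated assumptions: each candidate plan is modelled as a non-empty array of booleans (true means that unit of work produced a result), and runTrial, numWorks and numResults here are illustrative stand-ins, not the MongoDB functions and variables of similar name.

```javascript
// Simplified model of the trial period: every candidate plan gets one
// unit of work per round. The trial stops when any plan runs out of
// work (EOF) or has produced numResults results, or after numWorks rounds.
function runTrial(plans, numWorks, numResults) {
  const stats = plans.map(() => ({ works: 0, advanced: 0 }));
  for (let ix = 0; ix < numWorks; ix++) {
    let moreToDo = true;
    plans.forEach((outcomes, i) => {
      const s = stats[i];
      if (outcomes[s.works]) s.advanced++;
      s.works++;
      if (s.works >= outcomes.length || s.advanced >= numResults) {
        moreToDo = false;
      }
    });
    if (!moreToDo) break;
  }
  return stats;
}

// Hypothetical plans: one finds a result on its third unit of work,
// the other finds nothing for a long time.
const quickPlan = [false, false, true];
const slowPlan = new Array(1000).fill(false);

const trialStats = runTrial([quickPlan, slowPlan], 100, 1);
console.log(JSON.stringify(trialStats));
```

In this model the works and advanced counters play the role of stats->common.works and stats->common.advanced, which scoreTree reads afterwards.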

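Question C

If the bad query used the same index as the good query, would it still have the same problem?

MongoDB's hint lets you force a query to use a specific index. For example, appending .hint({_id:1}) to the bad query above forces it to use the primary-key index: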

db.myColl.find({app:"my_app",requestTime:{$gte:1492502247000,$lt:1492588800000}}).sort({_id:-1}).limit(1).hint({_id:1})

Even so, the query is still slow. Adding .explain('executionStats') again shows what happens. The explain output was already explained under Question A, so this time focus on keysExamined in the IXSCAN stage:

{
  ...
  "executionStages" : {
    "stage" : "LIMIT",
    ...
    "inputStage" : {
      "stage" : "FETCH",
      "filter" : {
        "$and" : [
          { "app" : { "$eq" : "my_app" } },
          { "requestTime" : { "$lt" : 1492588800000 } },
          { "requestTime" : { "$gte" : 1492502247000 } }
        ]
      },
      "nReturned" : 1,
      ...
      "inputStage" : {
        "stage" : "IXSCAN",
        ...
        "nReturned" : 32862524,
        ...
        "keysExamined" : 32862524,
        ...
  ...
}

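It examined 32862524 index keys, so it is still slow. This is fairly easy to explain. executionStats.executionStages shows that the hinted query goes through the stages LIMIT => FETCH => IXSCAN. The IXSCAN stage returned 32862524 entries, and the FETCH stage filtered them down to a single document, so 32862523 of the scanned entries were wasted. Why so many wasted scans?

This comes from the business logic. The requestTime timestamp grows over time, and the primary key _id can also be considered to grow over time. Scanning the primary-key index in reverse order therefore visits the newest documents first, and the newest documents satisfy "requestTime" : {"$gte" : 1492502247000}. The good query only needs to find a document with "app" : {"$eq" : "my_app"} and returns very quickly.

The bad query's condition "requestTime" : {"$gte" : 1492502247000, "$lt" : 1492588800000}, however, includes "$lt" : 1492588800000, which the newest documents cannot satisfy. The scan has to pass every document newer than the timestamp 1492588800000 before it can return a match.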
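The effect is easy to reproduce on synthetic data. The sketch below is plain JavaScript with made-up documents whose _id and requestTime grow together; scanBackward is a hypothetical stand-in for a backward scan of the _id index followed by the FETCH filter, not MongoDB code. It counts how many entries are examined before the first match.

```javascript
// Synthetic collection: _id and requestTime both increase with insertion order.
const docs = [];
for (let i = 0; i < 1000; i++) {
  docs.push({ _id: i, app: "my_app", requestTime: 1000 + i });
}

// Walk the documents from the largest _id down, like a backward scan of
// the _id index, and stop at the first document that passes the filter.
function scanBackward(collection, matches) {
  let examined = 0;
  for (let i = collection.length - 1; i >= 0; i--) {
    examined++;
    if (matches(collection[i])) return { doc: collection[i], examined };
  }
  return { doc: null, examined };
}

// "good query": lower bound only. "bad query": lower and upper bound.
const good = scanBackward(docs, d => d.app === "my_app" && d.requestTime >= 1200);
const bad = scanBackward(
  docs,
  d => d.app === "my_app" && d.requestTime >= 1200 && d.requestTime < 1500
);
console.log(good.examined, bad.examined); // prints "1 501"
```

The lower-bound-only filter matches the very first entry, while the filter with an upper bound must step over all 500 newer documents first.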

Mr. Su proposed a neat solution: sort by requestTime instead of by _id. The query then uses the "requestTime" : -1 index and only needs to filter on "app" : {"$eq" : "my_app"}, so it also completes within milliseconds.
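Why sorting by requestTime helps can be sketched on synthetic data as well. Assuming, as the text does, an index on requestTime in descending order, the scan can seek straight to the upper bound instead of walking past every newer document. The code below is an illustration in plain JavaScript; firstIndexBelow and scanRequestTimeIndex are hypothetical names, and the binary search only stands in for an index seek. Note that the result is the document with the largest requestTime in range, which is the same as the one with the largest _id only while both fields grow together.

```javascript
// Synthetic index entries sorted by requestTime descending.
const entries = [];
for (let i = 999; i >= 0; i--) {
  entries.push({ _id: i, app: "my_app", requestTime: 1000 + i });
}

// Binary search for the first position whose requestTime is < upper.
function firstIndexBelow(sorted, upper) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid].requestTime >= upper) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Seek to the upper bound, scan while requestTime >= lower, and return
// the first entry whose app matches.
function scanRequestTimeIndex(sorted, app, lower, upper) {
  let examined = 0;
  for (let i = firstIndexBelow(sorted, upper); i < sorted.length; i++) {
    if (sorted[i].requestTime < lower) break;
    examined++;
    if (sorted[i].app === app) return { doc: sorted[i], examined };
  }
  return { doc: null, examined };
}

const fixed = scanRequestTimeIndex(entries, "my_app", 1200, 1500);
console.log(fixed.doc._id, fixed.examined); // prints "499 1"
```

On this data the scan examines a single entry, compared with 501 entries for a backward _id scan over the same range.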

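Summary

Building an effective offline debugging environment is an important way to reproduce and solve problems. For example, salt was used to quickly set up a local cluster when reproducing a zk problem earlier.

Maintaining an open-source product without understanding its source code, or without an effective entry point for reading it, leaves you in a passive position: you lack the fundamental means to locate and solve problems.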

References

https://docs.mongodb.com/manual/
http://www.cnblogs.com/xjk15082/archive/2011/09/18/2180792.html
https://lldb.llvm.org/lldb-gdb.html
https://github.com/mongodb/mongo/wiki/Build-Mongodb-From-Source

Thanks to Lin Qing (林青) for the key help during the investigation.
