ORB-SLAM2代码阅读笔记（五）：Tracking线程3——Track函数中单目相机初始化

xiaoxiao2024-12-02 94

Table of Contents

1.特征点匹配相关理论简介

2.ORB-SLAM2中特征匹配代码分析

（1）Tracking线程中的状态机

（2）单目相机初始化函数MonocularInitialization代码分析

（3）ORBmatcher::SearchForInitialization函数代码分析

（4）mpInitializer->Initialize函数代码分析

1）单应矩阵计算相机位姿

2）基础矩阵计算相机位姿

（5）CreateInitialMapMonocular函数代码分析

3.小结

在 ORB-SLAM2代码阅读笔记（四）：Tracking线程2——ORB特征提取中已经分析了从图像中提取关键点和描述子的代码流程，接下来就是对提取的特征点进行匹配。特征点匹配简单来说，就是把两帧图片中提取的特征进行对比，找到图片一中的特征点在图片二中相对应的特征点。这样一帧一帧的图片之间才有了关联，图片帧也才有了意义。

因为系统中要求只有输入的前两帧图中提取的特征点数都大于100，才能进行前期的初始化。结合实际看看，只有前边输入的两帧图像提取的特征够多，这时候才能去进行匹配进而计算出这两帧图像之间的相机位姿变换，后期输入的图像的位姿都是在此基础上进行估算的，所以说前两帧图像的初始化也就是为系统设定了一些初始值（正所谓有了个初始状态）。

1.特征点匹配相关理论简介

ORB-SLAM2中的特征点匹配分为关键点匹配和描述子匹配：

1）关键点匹配：ORB-SLAM2将图像提取的特征值“比较均匀”的分布在了48*64的网格中，这样当得到F1帧中的特征点坐标(x,y)后，就可以计算该(x,y)坐标在F2的的特征点网格中的哪个网格内。系统中的计算方法是在以(x,y)为中心半径为100的范围内进行匹配，查找F2帧在这个范围内和(x,y)距离最小于半径100的特征点，找到的都算作F1帧中坐标为(x,y)的特征点在F2帧中潜在的匹配点。这些潜在的匹配点会存放在一个列表中。

2）描述子匹配：在关键点匹配的基础上进行描述子匹配，此时遍历1）中得到的潜在的特征匹配点列表，计算每个特征点的描述子和F1中（x,y）所在的特征点描述子的距离，记录距离最小和次小的两个特征点。同时，最小距离要小于TH_LOW(代码中值为50)，并且要小于次小距离的0.9倍的距离，才算描述子基本匹配成功。最后，还要进行描述子方向的比较，只有描述子方向也相同才认为描述子匹配成功。

ORB-SLAM2这种划分网格的匹配方式比遍历一个个关键点进行暴力匹配的效率要高很多。并且分了关键点匹配和描述子匹配两个步骤，这样在关键点匹配完成后，就已经过滤了许多特征点，再进行描述子匹配就会轻松好多。

所以，ORB-SLAM2中特征点匹配成功的条件是：关键点匹配成功&&描述子距离符合要求&&描述子方向检测。

2.ORB-SLAM2中特征匹配代码分析

本部分的代码入口函数为Tracking.cc中的Track()函数。

Track函数代码按照流程主要分为3步：

1）第一次调用Track函数时，根据输入帧进行初始化。主要是对前两帧图像初始化和特征点匹配；

2）当初始化完成后，再有图像帧传入的时候，进行相机跟踪也就是位姿估计；

3）存储根据当前帧计算出的相机位姿（也就是变换矩阵等数据）；

Track函数代码大体结构如下图所示：

（1）Tracking线程中的状态机

Tracking中的状态机如下：

SYSTEM_NOT_READY=-1, //系统初始化开始前为FrameDrawer类对象中的mState赋值为SYSTEM_NOT_READY NO_IMAGES_YET=0, //如果图片复位过或者第一次运行，则为NO_IMAGES_YET NOT_INITIALIZED=1,//系统未初始化好时的状态标记 OK=2,//系统初始化完成或者重定位后会标记为OK状态 LOST=3 //局部地图跟踪失败的情况下标记为该状态，系统中为TrackLocalMap函数调用返回false的情况下给mState重置该状态

（2）单目相机初始化函数MonocularInitialization代码分析

/** * 单目相机初始化函数 * 功能：创建初始化器，并对前两个关键点数大于100的帧进行特征点匹配，根据匹配结果计算出当前帧的变换矩阵并在窗口中显示 * 1. 第一次进入该方法,如果当前帧关键点数>100,将当前帧保存为初始帧和最后一帧，并创建一个初始化器； * 2. 第二次进入该方法的时候，已经有初始化器了，如果当前帧中的关键点数>100； * 3. 利用ORB匹配器，对当前帧和初始帧进行匹配，匹配关键点数小于100时失败； * 4. 利用匹配的关键点信息进行单应矩阵和基础矩阵的计算，进而计算出相机位姿的旋转矩阵和平移矩阵； * 5. 进行三角化判断，删除不能三角化的无用特征关键点； * 6. 由旋转矩阵和平移矩阵构造变换矩阵； * 7. 将三角化得到的3D点包装成MapPoints，在地图中显示； */ void Tracking::MonocularInitialization() { //如果单目初始器还没有被创建，则创建单目初始器 if(!mpInitializer) { // Set Reference Frame //step 1：第一次进入该方法,如果当前帧关键点数>100,将当前帧保存为初始帧和最后一帧，并创建一个初始化器 if(mCurrentFrame.mvKeys.size()>100) { mInitialFrame = Frame(mCurrentFrame); mLastFrame = Frame(mCurrentFrame); //mvbPrevMatched的大小设置为已经提取的关键点的个数，mvbPrevMatched最大的情况就是所有特征点都被跟踪上 mvbPrevMatched.resize(mCurrentFrame.mvKeysUn.size()); for(size_t i=0; i<mCurrentFrame.mvKeysUn.size(); i++) //mvbPrevMatched中存储关键点的坐标，用于新的一帧到来时进行匹配 mvbPrevMatched[i]=mCurrentFrame.mvKeysUn[i].pt; if(mpInitializer) delete mpInitializer; //由当前帧构造初始器 sigma:1.0 iterations:200 mpInitializer = new Initializer(mCurrentFrame,1.0,200); //mvIniMatches中所有的元素值设置为-1 fill(mvIniMatches.begin(),mvIniMatches.end(),-1); return; } } else//如果是第二次进入，这时候已经创建了初始器 { // Try to initialize /** * step 2：第二次进入该方法的时候，已经有初始化器了，如果当前帧中的关键点数>100，则继续进行匹配工作。 * 如果当前帧特征点太少，释放初始化器。因此只有连续两帧的特征点个数都大于100时，才能继续进行初始化过程 */ if((int)mCurrentFrame.mvKeys.size()<=100) { //特征点数少于100，此时删除初始化器 delete mpInitializer; mpInitializer = static_cast<Initializer*>(NULL); fill(mvIniMatches.begin(),mvIniMatches.end(),-1); return; } // Find correspondences //创建ORBmatcher ORBmatcher matcher(0.9,true); //计算当前帧和初始化帧之间的匹配关系 /** * step 3：在mInitialFrame与mCurrentFrame中找匹配的特征点对 * mvbPrevMatched为前一帧的特征点的坐标，存储了mInitialFrame中哪些点将进行接下来的匹配 * mvIniMatches用于存储mInitialFrame,mCurrentFrame之间匹配的特征点 */ int nmatches = matcher.SearchForInitialization(mInitialFrame,mCurrentFrame,mvbPrevMatched,mvIniMatches,100); // Check if there are enough correspondences if(nmatches<100) { //释放初始化器 delete mpInitializer; mpInitializer = static_cast<Initializer*>(NULL); return; } cv::Mat Rcw; // Current Camera Rotation 定义相机旋转矩阵 cv::Mat tcw; // Current Camera Translation 定义相机平移矩阵 vector<bool> vbTriangulated; // Triangulated Correspondences (mvIniMatches)三角化对应关系 //step 4：利用匹配的关键点信息进行单应矩阵和基础矩阵的计算，进而计算出相机位姿的旋转矩阵和平移矩阵 //在该函数中创建了计算单应矩阵和基础矩阵的两个线程，计算的旋转和平移量值存放在Rcw和tcw中 //mvIniMatches[i]中i为前一帧匹配的关键点的index，值为当前帧的匹配的关键点的index if(mpInitializer->Initialize(mCurrentFrame, mvIniMatches, Rcw, tcw, mvIniP3D, vbTriangulated)) { //step 5：删除那些无法进行三角化的匹配点 for(size_t i=0, iend=mvIniMatches.size(); i<iend;i++) { //判断该点是否可以三角化 if(mvIniMatches[i]>=0 && !vbTriangulated[i]) { //表示两帧对应的关键点不再匹配 mvIniMatches[i]=-1; //关键点匹配个数-1 nmatches--; } } // Set Frame Poses //将初始化的第一帧作为世界坐标系，因此第一帧变换矩阵为单位矩阵 mInitialFrame.SetPose(cv::Mat::eye(4,4,CV_32F)); /** * step 6：由旋转矩阵和平移矩阵构造变换矩阵 * 由Rcw和tcw构造Tcw,并赋值给mTcw，mTcw为世界坐标系到该帧的变换矩阵 * 这里构造出来的Tcw为一个4*4的矩阵，其中的Rcw为3*3，tcw为3*1如下所示： * |Rcw tcw| * |0 1 | */ cv::Mat Tcw = cv::Mat::eye(4,4,CV_32F); Rcw.copyTo(Tcw.rowRange(0,3).colRange(0,3)); tcw.copyTo(Tcw.rowRange(0,3).col(3)); mCurrentFrame.SetPose(Tcw); /** * step 7：将三角化得到的3D点包装成MapPoints，在地图中显示 * Initialize函数会得到mvIniP3D， * mvIniP3D是cv::Point3f类型的一个容器，是个存放3D点的临时变量， * CreateInitialMapMonocular将3D点包装成MapPoint类型存入KeyFrame和Map中 */ CreateInitialMapMonocular(); } } }

MonocularInitialization()函数中进行两帧间特征点匹配的函数为下面这行代码，调用ORBmatcher中的SearchForInitialization函数。

//计算当前帧和初始化帧之间的匹配关系 /** * step 3：在mInitialFrame与mCurrentFrame中找匹配的特征点对 * mvbPrevMatched为前一帧的特征点的坐标，存储了mInitialFrame中哪些点将进行接下来的匹配 * mvIniMatches用于存储mInitialFrame,mCurrentFrame之间匹配的特征点 */ int nmatches = matcher.SearchForInitialization(mInitialFrame,mCurrentFrame,mvbPrevMatched,mvIniMatches,100);

（3）ORBmatcher::SearchForInitialization函数代码分析

/** * 功能：对两个帧F1和F2进行特征点匹配 * F1 为初始化帧 * F2 为当前帧 * vbPrevMatched 为初始帧的特征点坐标 * vnMatches12 用于存储F1和F2之间匹配的特征点 * windowSize 窗口大小，为100 * 思路： * 1.遍历F1中的关键点，对每一个取出来的关键点再到F2中存放关键点的网格里找距离小于r的对应关键点； * 2.遍历从F2中找出来的关键点，并计算F1对应关键点和F2这些关键点之间描述子的距离，计算出最小距离和次小距离； */ int ORBmatcher::SearchForInitialization(Frame &F1, Frame &F2, vector<cv::Point2f> &vbPrevMatched, vector<int> &vnMatches12, int windowSize) { int nmatches=0; //vnMatches12的大小为前一帧中特征点数的大小，初始化元素值为-1 vnMatches12 = vector<int>(F1.mvKeysUn.size(),-1); //创建了HISTO_LENGTH=30大小、类型为vector<int>的数组 vector<int> rotHist[HISTO_LENGTH]; for(int i=0;i<HISTO_LENGTH;i++) rotHist[i].reserve(500); //比例因子，1/30 const float factor = 1.0f/HISTO_LENGTH; //匹配的描述子距离存储，初始值为INT_MAX vector<int> vMatchedDistance(F2.mvKeysUn.size(),INT_MAX); //记录当前帧和初始化帧的匹配 vector<int> vnMatches21(F2.mvKeysUn.size(),-1); //遍历初始化帧中的特征点 for(size_t i1=0, iend1=F1.mvKeysUn.size(); i1<iend1; i1++) { cv::KeyPoint kp1 = F1.mvKeysUn[i1]; //只处理金字塔的第0层中提取的特征点，也就是原始图 int level1 = kp1.octave; if(level1>0) continue; //在当前图像帧中半径为windowSize（100）的范围内搜索与初始帧x,y坐标位置的关键点可能匹配的关键点的索引 //vbPrevMatched中存放的是前一帧的关键点的坐标 //vIndices2为输入的(x,y)和F2中关键点距离<windowSize(100)的关键点的index值 vector<size_t> vIndices2 = F2.GetFeaturesInArea(vbPrevMatched[i1].x,vbPrevMatched[i1].y, windowSize,level1,level1); if(vIndices2.empty()) continue; //F1中kp1关键点对应的描述子 cv::Mat d1 = F1.mDescriptors.row(i1); int bestDist = INT_MAX; int bestDist2 = INT_MAX; int bestIdx2 = -1; //查找F2帧中找到的与F1(x,y)距离<r的关键点中和F1的当前关键点描述子距离最小的描述子 for(vector<size_t>::iterator vit=vIndices2.begin(); vit!=vIndices2.end(); vit++) { size_t i2 = *vit; cv::Mat d2 = F2.mDescriptors.row(i2); //计算两个描述子的距离 int dist = DescriptorDistance(d1,d2); if(vMatchedDistance[i2]<=dist) continue; //找到最小的前两个距离，dist先和最小距离比较，然后再和第二小距离比较 if(dist<bestDist) { //先把当前记录的最小距离赋值给第二小的距离变量，再把当前距离赋值给最小距离变量 bestDist2=bestDist; bestDist=dist; //记录最小距离的index值 bestIdx2=i2; } else if(dist<bestDist2) { bestDist2=dist; } } //TH_LOW==50，确保最小距离小于阈值50 if(bestDist<=TH_LOW) { //再确保最小距离要小于次小距离乘以mfNNratio=0.9 if(bestDist<(float)bestDist2*mfNNratio) { //如果已经匹配，则说明当前特征已经有过对应，则就会有两个对应，移除该匹配 if(vnMatches21[bestIdx2]>=0) { vnMatches12[vnMatches21[bestIdx2]]=-1; nmatches--; } //bestIdx2里存放的是F2中和F1当前遍历关键点描述子距离最小的index值 //则vnMatches12[i1]中存储的是F1中关键点index为i1对应的F2中关键点的index vnMatches12[i1]=bestIdx2; //vnMatches21[i1]中存储的是F2中index为bestIdx2的关键点对应的F1中关键点的index值 vnMatches21[bestIdx2]=i1; //vMatchedDistance[bestIdx2]表示F2中index为bestIdx2的关键点，和F1中描述子距离最小的关键点之间的距离 vMatchedDistance[bestIdx2]=bestDist; //记录匹配的特征点个数 nmatches++; //是否检查方向 if(mbCheckOrientation) { float rot = F1.mvKeysUn[i1].angle-F2.mvKeysUn[bestIdx2].angle; if(rot<0.0) rot+=360.0f; //rot为关键点的方向差值 int bin = round(rot*factor);//factor=1/30进行四舍五入得到bin if(bin==HISTO_LENGTH)//bin如果等于30，则将bin置为0 bin=0; assert(bin>=0 && bin<HISTO_LENGTH); //rotHist中bin的位置存放F1中关键点的index值 rotHist[bin].push_back(i1);//得到直方 } } } } //进行方向匹配 if(mbCheckOrientation) { int ind1=-1; int ind2=-1; int ind3=-1; ComputeThreeMaxima(rotHist,HISTO_LENGTH,ind1,ind2,ind3); for(int i=0; i<HISTO_LENGTH; i++) { if(i==ind1 || i==ind2 || i==ind3) continue; for(size_t j=0, jend=rotHist[i].size(); j<jend; j++) { int idx1 = rotHist[i][j]; if(vnMatches12[idx1]>=0) { vnMatches12[idx1]=-1; nmatches--; } } } } //Update prev matched //vbPrevMatched[i1]中存放的是F1中index为i1的关键点匹配的F2中关键点的坐标值 for(size_t i1=0, iend1=vnMatches12.size(); i1<iend1; i1++) if(vnMatches12[i1]>=0) vbPrevMatched[i1]=F2.mvKeysUn[vnMatches12[i1]].pt; return nmatches; }

GetFeaturesInArea函数代码解析如下：

/** * 根据选取的初始帧的特征点的位置(x,y)作为输入，然后在第二帧该位置半径r的范围内，寻找可能匹配的点，注意这边的匹配只考虑了原始图像即尺度图像的第一层的特征 * 思路： * 1.从初始化帧的关键点坐标(x,y)计算出该关键点在网格中半径为r的范围； * 2.遍历这个范围内的网格，获取当前帧在这个范围内的关键点的坐标； * 3.并计算该坐标关键点距离初始化关键点(x,y)的距离，<r表示匹配，将匹配的关键点index存入vIndices中； * 输出：vIndices中存的是距离<r的关键点的index,也就表示匹配的关键点的index */ vector<size_t> Frame::GetFeaturesInArea(const float &x, const float &y, const float &r, const int minLevel, const int maxLevel) const { vector<size_t> vIndices; vIndices.reserve(N); //计算以关键点坐标(x,y)为中心，半径为r的的区域在48*64的网格中的位置 const int nMinCellX = max(0,(int)floor((x-mnMinX-r)*mfGridElementWidthInv)); if(nMinCellX>=FRAME_GRID_COLS) return vIndices; const int nMaxCellX = min((int)FRAME_GRID_COLS-1,(int)ceil((x-mnMinX+r)*mfGridElementWidthInv)); if(nMaxCellX<0) return vIndices; const int nMinCellY = max(0,(int)floor((y-mnMinY-r)*mfGridElementHeightInv)); if(nMinCellY>=FRAME_GRID_ROWS) return vIndices; const int nMaxCellY = min((int)FRAME_GRID_ROWS-1,(int)ceil((y-mnMinY+r)*mfGridElementHeightInv)); if(nMaxCellY<0) return vIndices; const bool bCheckLevels = (minLevel>0) || (maxLevel>=0); //遍历mGrid网格，这个网格中存放的是特征点坐标值 //查找当前帧的关键点中和传入的上个帧的关键点（x,y）距离<r的关键点 for(int ix = nMinCellX; ix<=nMaxCellX; ix++) { for(int iy = nMinCellY; iy<=nMaxCellY; iy++) { //从mGrid中获取当前Frame网格里的关键点的index值，可以通过对应的index在mvKeysUn中获取关键点 const vector<size_t> vCell = mGrid[ix][iy]; if(vCell.empty()) continue; for(size_t j=0, jend=vCell.size(); j<jend; j++) { //vCell[j]获取的是关键点在mvKeysUn中的index值 const cv::KeyPoint &kpUn = mvKeysUn[vCell[j]]; if(bCheckLevels) { if(kpUn.octave<minLevel) continue; if(maxLevel>=0) if(kpUn.octave>maxLevel) continue; } //(x,y)为初始化帧的关键点的坐标，kpUn.pt.x、kpUn.pt.y为当前帧的关键点 const float distx = kpUn.pt.x-x; const float disty = kpUn.pt.y-y; if(fabs(distx)<r && fabs(disty)<r) vIndices.push_back(vCell[j]); } } } return vIndices; }

（4）mpInitializer->Initialize函数代码分析

/** * mvIniMatches[i]中i为前一帧匹配的关键点的index，值为当前帧的匹配的关键点的index * 该函数计算出来的旋转结果存放在R21,平移结果值存放在t21中 * */ bool Initializer::Initialize(const Frame &CurrentFrame, const vector<int> &vMatches12, cv::Mat &R21, cv::Mat &t21, vector<cv::Point3f> &vP3D, vector<bool> &vbTriangulated) { // Fill structures with current keypoints and matches with reference frame // Reference Frame: 1, Current Frame: 2 //参考帧：1 当前帧：2 mvKeys2 = CurrentFrame.mvKeysUn; //clear只是将vector的size置零，可是并不保证capacity为零，因此clear并不能释放vector已经申请的内存 mvMatches12.clear(); mvMatches12.reserve(mvKeys2.size()); mvbMatched1.resize(mvKeys1.size()); //对匹配对应关系进行筛选 for(size_t i=0, iend=vMatches12.size();i<iend; i++) { if(vMatches12[i]>=0) { //mvMatches12中存放的pair里的i为前一帧的关键点的index,vMatches12[i]为CurrentFrame中与i想匹配的关键点的index mvMatches12.push_back(make_pair(i,vMatches12[i])); //mvbMatched1中的index为前一帧中关键点的index mvbMatched1[i]=true; } else mvbMatched1[i]=false; } const int N = mvMatches12.size(); // Indices for minimum set selection vector<size_t> vAllIndices; vAllIndices.reserve(N); vector<size_t> vAvailableIndices; //遍历所有匹配的关键点的index for(int i=0; i<N; i++) { vAllIndices.push_back(i); } // Generate sets of 8 points for each RANSAC iteration mMaxIterations:200 //8点法的使用 mvSets = vector< vector<size_t> >(mMaxIterations,vector<size_t>(8,0)); DUtils::Random::SeedRandOnce(0); //遍历200次，每次都随机取8个点对 for(int it=0; it<mMaxIterations; it++) { vAvailableIndices = vAllIndices; // Select a minimum set for(size_t j=0; j<8; j++) { int randi = DUtils::Random::RandomInt(0,vAvailableIndices.size()-1); int idx = vAvailableIndices[randi]; //将随机生成的index值存放在容器当中 mvSets[it][j] = idx; //vAvailableIndices末尾元素的存储 vAvailableIndices[randi] = vAvailableIndices.back(); //删除vAvailableIndices中的末尾元素 vAvailableIndices.pop_back(); } } // Launch threads to compute in parallel a fundamental matrix and a homography /** * 为了能够避免退化现象造成的影响，通常我们会同时估计基础矩阵F和单应矩阵H, * 选择重投影误差比较小的那个作为最终的运动估计矩阵 * ---来自高翔《视觉SLAM十四讲》 * */ vector<bool> vbMatchesInliersH, vbMatchesInliersF; float SH, SF; cv::Mat H, F; //开启线程计算单应矩阵 thread threadH(&Initializer::FindHomography,this,ref(vbMatchesInliersH), ref(SH), ref(H)); //开启线程计算基础矩阵 thread threadF(&Initializer::FindFundamental,this,ref(vbMatchesInliersF), ref(SF), ref(F)); // Wait until both threads have finished threadH.join(); threadF.join(); // Compute ratio of scores float RH = SH/(SH+SF); // Try to reconstruct from homography or fundamental depending on the ratio (0.40-0.45) //单应矩阵分值的比例超过0.40则用单应矩阵来恢复运动，这里计算出来的旋转向量结果存放在R21中,平移向量存放在t21中 if(RH>0.40) //单应矩阵恢复运动 return ReconstructH(vbMatchesInliersH,H,mK,R21,t21,vP3D,vbTriangulated,1.0,50); else //if(pF_HF>0.6) //基础矩阵恢复运动 return ReconstructF(vbMatchesInliersF,F,mK,R21,t21,vP3D,vbTriangulated,1.0,50); return false; }

1）单应矩阵计算相机位姿

计算单应矩阵的线程入口函数如下：

/** * 计算单应矩阵 * */ void Initializer::FindHomography(vector<bool> &vbMatchesInliers, float &score, cv::Mat &H21) { // Number of putative matches //mvMatches12中存放的pair里的i为前一帧的关键点的index,vMatches12[i]为CurrentFrame中与i想匹配的关键点的index const int N = mvMatches12.size(); // Normalize coordinates vector<cv::Point2f> vPn1, vPn2; cv::Mat T1, T2; //mvKeys1为参考帧的关键点，mvKeys2为当前帧的关键点 Normalize(mvKeys1,vPn1, T1); Normalize(mvKeys2,vPn2, T2); //计算T2的逆 cv::Mat T2inv = T2.inv(); // Best Results variables score = 0.0; vbMatchesInliers = vector<bool>(N,false); // Iteration variables vector<cv::Point2f> vPn1i(8); vector<cv::Point2f> vPn2i(8); cv::Mat H21i, H12i; vector<bool> vbCurrentInliers(N,false); float currentScore; // Perform all RANSAC iterations and save the solution with highest score //mMaxIterations值为200 for(int it=0; it<mMaxIterations; it++) { // Select a minimum set //每次取8对点在ComputeH21中进行计算 for(size_t j=0; j<8; j++) { int idx = mvSets[it][j]; //获取匹配的关键点对中参考帧里的index值 vPn1i[j] = vPn1[mvMatches12[idx].first]; //获取匹配的关键点对中当前帧里的index值 vPn2i[j] = vPn2[mvMatches12[idx].second]; } //根据匹配的8对点计算 cv::Mat Hn = ComputeH21(vPn1i,vPn2i); H21i = T2inv*Hn*T1; H12i = H21i.inv(); currentScore = CheckHomography(H21i, H12i, vbCurrentInliers, mSigma); if(currentScore>score) { H21 = H21i.clone(); vbMatchesInliers = vbCurrentInliers; score = currentScore; } } } //8对点法计算单应矩阵 cv::Mat Initializer::ComputeH21(const vector<cv::Point2f> &vP1, const vector<cv::Point2f> &vP2) { //因为是8点法计算，所以这里的N值为8 const int N = vP1.size(); //A为16*9的矩阵 cv::Mat A(2*N,9,CV_32F); //遍历这8对点 for(int i=0; i<N; i++) { const float u1 = vP1[i].x; const float v1 = vP1[i].y; const float u2 = vP2[i].x; const float v2 = vP2[i].y; A.at<float>(2*i,0) = 0.0; A.at<float>(2*i,1) = 0.0; A.at<float>(2*i,2) = 0.0; A.at<float>(2*i,3) = -u1; A.at<float>(2*i,4) = -v1; A.at<float>(2*i,5) = -1; A.at<float>(2*i,6) = v2*u1; A.at<float>(2*i,7) = v2*v1; A.at<float>(2*i,8) = v2; A.at<float>(2*i+1,0) = u1; A.at<float>(2*i+1,1) = v1; A.at<float>(2*i+1,2) = 1; A.at<float>(2*i+1,3) = 0.0; A.at<float>(2*i+1,4) = 0.0; A.at<float>(2*i+1,5) = 0.0; A.at<float>(2*i+1,6) = -u2*u1; A.at<float>(2*i+1,7) = -u2*v1; A.at<float>(2*i+1,8) = -u2; } cv::Mat u,w,vt; //进行SVD分解 cv::SVDecomp(A,w,u,vt,cv::SVD::MODIFY_A | cv::SVD::FULL_UV); return vt.row(8).reshape(0, 3); }

从单应矩阵计算相机位姿：

bool Initializer::ReconstructH(vector<bool> &vbMatchesInliers, cv::Mat &H21, cv::Mat &K, cv::Mat &R21, cv::Mat &t21, vector<cv::Point3f> &vP3D, vector<bool> &vbTriangulated, float minParallax, int minTriangulated) { int N=0; for(size_t i=0, iend = vbMatchesInliers.size() ; i<iend; i++) if(vbMatchesInliers[i]) N++; // We recover 8 motion hypotheses using the method of Faugeras et al. // Motion and structure from motion in a piecewise planar environment. // International Journal of Pattern Recognition and Artificial Intelligence, 1988 cv::Mat invK = K.inv(); cv::Mat A = invK*H21*K; cv::Mat U,w,Vt,V; //SVD分解 cv::SVD::compute(A,w,U,Vt,cv::SVD::FULL_UV); V=Vt.t(); float s = cv::determinant(U)*cv::determinant(Vt); float d1 = w.at<float>(0); float d2 = w.at<float>(1); float d3 = w.at<float>(2); if(d1/d2<1.00001 || d2/d3<1.00001) { return false; } vector<cv::Mat> vR, vt, vn; vR.reserve(8); vt.reserve(8); vn.reserve(8); //n'=[x1 0 x3] 4 posibilities e1=e3=1, e1=1 e3=-1, e1=-1 e3=1, e1=e3=-1 float aux1 = sqrt((d1*d1-d2*d2)/(d1*d1-d3*d3)); float aux3 = sqrt((d2*d2-d3*d3)/(d1*d1-d3*d3)); float x1[] = {aux1,aux1,-aux1,-aux1}; float x3[] = {aux3,-aux3,aux3,-aux3}; //case d'=d2 float aux_stheta = sqrt((d1*d1-d2*d2)*(d2*d2-d3*d3))/((d1+d3)*d2); float ctheta = (d2*d2+d1*d3)/((d1+d3)*d2); float stheta[] = {aux_stheta, -aux_stheta, -aux_stheta, aux_stheta}; for(int i=0; i<4; i++) { cv::Mat Rp=cv::Mat::eye(3,3,CV_32F); Rp.at<float>(0,0)=ctheta; Rp.at<float>(0,2)=-stheta[i]; Rp.at<float>(2,0)=stheta[i]; Rp.at<float>(2,2)=ctheta; cv::Mat R = s*U*Rp*Vt; vR.push_back(R); cv::Mat tp(3,1,CV_32F); tp.at<float>(0)=x1[i]; tp.at<float>(1)=0; tp.at<float>(2)=-x3[i]; tp*=d1-d3; cv::Mat t = U*tp; vt.push_back(t/cv::norm(t)); cv::Mat np(3,1,CV_32F); np.at<float>(0)=x1[i]; np.at<float>(1)=0; np.at<float>(2)=x3[i]; cv::Mat n = V*np; if(n.at<float>(2)<0) n=-n; vn.push_back(n); } //case d'=-d2 float aux_sphi = sqrt((d1*d1-d2*d2)*(d2*d2-d3*d3))/((d1-d3)*d2); float cphi = (d1*d3-d2*d2)/((d1-d3)*d2); float sphi[] = {aux_sphi, -aux_sphi, -aux_sphi, aux_sphi}; for(int i=0; i<4; i++) { cv::Mat Rp=cv::Mat::eye(3,3,CV_32F); Rp.at<float>(0,0)=cphi; Rp.at<float>(0,2)=sphi[i]; Rp.at<float>(1,1)=-1; Rp.at<float>(2,0)=sphi[i]; Rp.at<float>(2,2)=-cphi; cv::Mat R = s*U*Rp*Vt; vR.push_back(R); cv::Mat tp(3,1,CV_32F); tp.at<float>(0)=x1[i]; tp.at<float>(1)=0; tp.at<float>(2)=x3[i]; tp*=d1+d3; cv::Mat t = U*tp; vt.push_back(t/cv::norm(t)); cv::Mat np(3,1,CV_32F); np.at<float>(0)=x1[i]; np.at<float>(1)=0; np.at<float>(2)=x3[i]; cv::Mat n = V*np; if(n.at<float>(2)<0) n=-n; vn.push_back(n); } int bestGood = 0; int secondBestGood = 0; int bestSolutionIdx = -1; float bestParallax = -1; vector<cv::Point3f> bestP3D; vector<bool> bestTriangulated; // Instead of applying the visibility constraints proposed in the Faugeras' paper (which could fail for points seen with low parallax) // We reconstruct all hypotheses and check in terms of triangulated points and parallax for(size_t i=0; i<8; i++) { float parallaxi; vector<cv::Point3f> vP3Di; vector<bool> vbTriangulatedi; int nGood = CheckRT(vR[i],vt[i],mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K,vP3Di, 4.0*mSigma2, vbTriangulatedi, parallaxi); if(nGood>bestGood) { secondBestGood = bestGood; bestGood = nGood; bestSolutionIdx = i; bestParallax = parallaxi; bestP3D = vP3Di; bestTriangulated = vbTriangulatedi; } else if(nGood>secondBestGood) { secondBestGood = nGood; } } if(secondBestGood<0.75*bestGood && bestParallax>=minParallax && bestGood>minTriangulated && bestGood>0.9*N) { vR[bestSolutionIdx].copyTo(R21); vt[bestSolutionIdx].copyTo(t21); vP3D = bestP3D; vbTriangulated = bestTriangulated; return true; } return false; }

2）基础矩阵计算相机位姿

计算基础矩阵的线程入口函数如下：

/** * 计算基础矩阵 * */ void Initializer::FindFundamental(vector<bool> &vbMatchesInliers, float &score, cv::Mat &F21) { // Number of putative matches const int N = vbMatchesInliers.size(); // Normalize coordinates vector<cv::Point2f> vPn1, vPn2; cv::Mat T1, T2; Normalize(mvKeys1,vPn1, T1); Normalize(mvKeys2,vPn2, T2); cv::Mat T2t = T2.t(); // Best Results variables score = 0.0; vbMatchesInliers = vector<bool>(N,false); // Iteration variables vector<cv::Point2f> vPn1i(8); vector<cv::Point2f> vPn2i(8); cv::Mat F21i; vector<bool> vbCurrentInliers(N,false); float currentScore; // Perform all RANSAC iterations and save the solution with highest score for(int it=0; it<mMaxIterations; it++) { // Select a minimum set for(int j=0; j<8; j++) { int idx = mvSets[it][j]; vPn1i[j] = vPn1[mvMatches12[idx].first]; vPn2i[j] = vPn2[mvMatches12[idx].second]; } cv::Mat Fn = ComputeF21(vPn1i,vPn2i); F21i = T2t*Fn*T1; currentScore = CheckFundamental(F21i, vbCurrentInliers, mSigma); if(currentScore>score) { F21 = F21i.clone(); vbMatchesInliers = vbCurrentInliers; score = currentScore; } } } /** * 计算基础矩阵 * 本质矩阵为E,vP2的坐标为(u2,v2,1),vP1的坐标为(u1,v1,1) * 公式vP2*E*vP1=0成立，展开来就是下面公式： * |e1 e2 e3| |u1| * (u2, v1, 1) * |e4 e5 e6| *|v1| = 0 * |e7 e8 e9| |1 | * 计算整理可得： * u2*u1*e1 + u2*v1*e2 + u2*e3 + v2*u1*e4 + v2*v1*e5 + v2*e6 + u1*e7 + v1*e8 + 1*e9 = 0 * 将E作为向量提取出来可得： * (u2*u1 + u2*v1 + u2 + v2*u1 + v2*v1 + v2 + u1 + v1 + 1) * E = 0 * 这个函数中构造的A矩阵就是根据这个来构造的，可以求出本质矩阵E,进而根据公式F=K^-T*E*K^-1来计算出F * */ cv::Mat Initializer::ComputeF21(const vector<cv::Point2f> &vP1,const vector<cv::Point2f> &vP2) { const int N = vP1.size(); cv::Mat A(N,9,CV_32F); for(int i=0; i<N; i++) { const float u1 = vP1[i].x; const float v1 = vP1[i].y; const float u2 = vP2[i].x; const float v2 = vP2[i].y; A.at<float>(i,0) = u2*u1; A.at<float>(i,1) = u2*v1; A.at<float>(i,2) = u2; A.at<float>(i,3) = v2*u1; A.at<float>(i,4) = v2*v1; A.at<float>(i,5) = v2; A.at<float>(i,6) = u1; A.at<float>(i,7) = v1; A.at<float>(i,8) = 1; } cv::Mat u,w,vt; cv::SVDecomp(A,w,u,vt,cv::SVD::MODIFY_A | cv::SVD::FULL_UV); cv::Mat Fpre = vt.row(8).reshape(0, 3); cv::SVDecomp(Fpre,w,u,vt,cv::SVD::MODIFY_A | cv::SVD::FULL_UV); w.at<float>(2)=0; return u*cv::Mat::diag(w)*vt; }

从基础矩阵计算相机位姿：

/** * 思路：利用基础矩阵计算出本质矩阵，再从本质矩阵中分解出旋转和平移矩阵 * */ bool Initializer::ReconstructF(vector<bool> &vbMatchesInliers, cv::Mat &F21, cv::Mat &K, cv::Mat &R21, cv::Mat &t21, vector<cv::Point3f> &vP3D, vector<bool> &vbTriangulated, float minParallax, int minTriangulated) { int N=0; for(size_t i=0, iend = vbMatchesInliers.size() ; i<iend; i++) if(vbMatchesInliers[i]) N++; // Compute Essential Matrix from Fundamental Matrix /** * 根据基础矩阵来计算本质矩阵 * E = t^ * R,F = K^-T * E * K^-1 (K^-T为K矩阵转置的逆矩阵，K^-1为K的逆矩阵) * 此时F左乘K^T,右成K，可得K^T * F * K = E,下面的本质矩阵E21就是这么来计算的 * */ cv::Mat E21 = K.t()*F21*K; cv::Mat R1, R2, t; // Recover the 4 motion hypotheses //利用本质矩阵E21来恢复四种情况的运动假设 DecomposeE(E21,R1,R2,t); cv::Mat t1=t; cv::Mat t2=-t; // Reconstruct with the 4 hyphoteses and check vector<cv::Point3f> vP3D1, vP3D2, vP3D3, vP3D4; vector<bool> vbTriangulated1,vbTriangulated2,vbTriangulated3, vbTriangulated4; float parallax1,parallax2, parallax3, parallax4; //对计算出来的4种解进行相机深度值的检查，判断标准是关键点在两个帧中计算出的深度都要为正值 int nGood1 = CheckRT(R1,t1,mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K, vP3D1, 4.0*mSigma2, vbTriangulated1, parallax1); int nGood2 = CheckRT(R2,t1,mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K, vP3D2, 4.0*mSigma2, vbTriangulated2, parallax2); int nGood3 = CheckRT(R1,t2,mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K, vP3D3, 4.0*mSigma2, vbTriangulated3, parallax3); int nGood4 = CheckRT(R2,t2,mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K, vP3D4, 4.0*mSigma2, vbTriangulated4, parallax4); int maxGood = max(nGood1,max(nGood2,max(nGood3,nGood4))); R21 = cv::Mat(); t21 = cv::Mat(); int nMinGood = max(static_cast<int>(0.9*N),minTriangulated); int nsimilar = 0; if(nGood1>0.7*maxGood) nsimilar++; if(nGood2>0.7*maxGood) nsimilar++; if(nGood3>0.7*maxGood) nsimilar++; if(nGood4>0.7*maxGood) nsimilar++; // If there is not a clear winner or not enough triangulated points reject initialization if(maxGood<nMinGood || nsimilar>1) { return false; } // If best reconstruction has enough parallax initialize if(maxGood==nGood1) { if(parallax1>minParallax) { vP3D = vP3D1; vbTriangulated = vbTriangulated1; R1.copyTo(R21); t1.copyTo(t21); return true; } }else if(maxGood==nGood2) { if(parallax2>minParallax) { vP3D = vP3D2; vbTriangulated = vbTriangulated2; R2.copyTo(R21); t1.copyTo(t21); return true; } }else if(maxGood==nGood3) { if(parallax3>minParallax) { vP3D = vP3D3; vbTriangulated = vbTriangulated3; R1.copyTo(R21); t2.copyTo(t21); return true; } }else if(maxGood==nGood4) { if(parallax4>minParallax) { vP3D = vP3D4; vbTriangulated = vbTriangulated4; R2.copyTo(R21); t2.copyTo(t21); return true; } } return false; } //E=t^R 对E进行分解，求出t和R void Initializer::DecomposeE(const cv::Mat &E, cv::Mat &R1, cv::Mat &R2, cv::Mat &t) { cv::Mat u,w,vt; cv::SVD::compute(E,w,u,vt); u.col(2).copyTo(t); t=t/cv::norm(t); cv::Mat W(3,3,CV_32F,cv::Scalar(0)); W.at<float>(0,1)=-1; W.at<float>(1,0)=1; W.at<float>(2,2)=1; R1 = u*W*vt; if(cv::determinant(R1)<0) R1=-R1; R2 = u*W.t()*vt; if(cv::determinant(R2)<0) R2=-R2; }

（5）CreateInitialMapMonocular函数代码分析

/** * 功能：为关键帧初始化生成对应的MapPoints */ void Tracking::CreateInitialMapMonocular() { // Create KeyFrames //创建初始帧和当前帧 KeyFrame* pKFini = new KeyFrame(mInitialFrame,mpMap,mpKeyFrameDB); KeyFrame* pKFcur = new KeyFrame(mCurrentFrame,mpMap,mpKeyFrameDB); //1：计算初始化关键帧和当前关键帧的BOW pKFini->ComputeBoW(); pKFcur->ComputeBoW(); // Insert KFs in the map //2.将基础关键帧和当前关键帧插入地图中，地图中就会显示 mpMap->AddKeyFrame(pKFini); mpMap->AddKeyFrame(pKFcur); // Create MapPoints and asscoiate to keyframes //3.遍历所有匹配的关键点创建对应的mapPoint、 for(size_t i=0; i<mvIniMatches.size();i++) { if(mvIniMatches[i]<0) continue; //Create MapPoint. //用已经初始化好的3D点来创建world坐标 cv::Mat worldPos(mvIniP3D[i]); // 3.1：用3D点构造MapPoint MapPoint* pMP = new MapPoint(worldPos,pKFcur,mpMap); //i为对应关键点的index值 pKFini->AddMapPoint(pMP,i); pKFcur->AddMapPoint(pMP,mvIniMatches[i]); /** * 3.2：为该MapPoint添加相关属性 */ //1）.哪些关键帧可以观测到该MapPoint pMP->AddObservation(pKFini,i); //mvIniMatches[i]为pMP这个MapPoint在pKFcur这个关键帧中对应的关键点的index值 pMP->AddObservation(pKFcur,mvIniMatches[i]); //2).从众多观测到该MapPoint的特征点中挑选区分度最高的描述子 pMP->ComputeDistinctiveDescriptors(); //3).更新该MapPoint平均观测方向以及观测距离的范围 pMP->UpdateNormalAndDepth(); //Fill Current Frame structure mCurrentFrame.mvpMapPoints[mvIniMatches[i]] = pMP; mCurrentFrame.mvbOutlier[mvIniMatches[i]] = false; //Add to Map //将mappoint插入到地图中 mpMap->AddMapPoint(pMP); } // Update Connections //4：更新关键帧间的连接关系 //在3D点和关键帧之间建立边，每个边有一个权重，边的权重是该关键帧与当前帧公共3D点的个数 pKFini->UpdateConnections(); pKFcur->UpdateConnections(); // Bundle Adjustment //mpMap->MapPointsInMap()返回的是map中的mappoint数目 cout << "New Map created with " << mpMap->MapPointsInMap() << " points" << endl; //5：进行BA优化 Optimizer::GlobalBundleAdjustemnt(mpMap,20); // Set median depth to 1 //6：将MapPoints的中值深度归一化到1，并归一化两帧之间变换,评估关键帧场景深度，q=2表示中值 float medianDepth = pKFini->ComputeSceneMedianDepth(2); float invMedianDepth = 1.0f/medianDepth; if(medianDepth<0 || pKFcur->TrackedMapPoints(1)<100) { cout << "Wrong initialization, reseting..." << endl; Reset(); return; } // Scale initial baseline cv::Mat Tc2w = pKFcur->GetPose(); //将Tc2w中的平移向量t进行了修改 Tc2w.col(3).rowRange(0,3) = Tc2w.col(3).rowRange(0,3)*invMedianDepth; pKFcur->SetPose(Tc2w); // Scale points vector<MapPoint*> vpAllMapPoints = pKFini->GetMapPointMatches(); for(size_t iMP=0; iMP<vpAllMapPoints.size(); iMP++) { if(vpAllMapPoints[iMP]) { MapPoint* pMP = vpAllMapPoints[iMP]; pMP->SetWorldPos(pMP->GetWorldPos()*invMedianDepth); } } //往LocalMapper中插入关键帧 mpLocalMapper->InsertKeyFrame(pKFini); mpLocalMapper->InsertKeyFrame(pKFcur); mCurrentFrame.SetPose(pKFcur->GetPose()); mnLastKeyFrameId=mCurrentFrame.mnId; mpLastKeyFrame = pKFcur; mvpLocalKeyFrames.push_back(pKFcur); mvpLocalKeyFrames.push_back(pKFini); mvpLocalMapPoints=mpMap->GetAllMapPoints(); //将当前关键帧设置为参考关键帧 mpReferenceKF = pKFcur; mCurrentFrame.mpReferenceKF = pKFcur; //更新mLastFrame mLastFrame = Frame(mCurrentFrame); mpMap->SetReferenceMapPoints(mvpLocalMapPoints); //设置当前帧的相机位姿 mpMapDrawer->SetCurrentCameraPose(pKFcur->GetPose()); mpMap->mvpKeyFrameOrigins.push_back(pKFini); //初始化完成，修改标记值 mState=OK; } /** * 更新当前帧与其他帧之间的连接关系 */ void KeyFrame::UpdateConnections() { map<KeyFrame*,int> KFcounter; vector<MapPoint*> vpMP; { unique_lock<mutex> lockMPs(mMutexFeatures); vpMP = mvpMapPoints; } //For all map points in keyframe check in which other keyframes are they seen //Increase counter for those keyframes //遍历当前关键帧所能观察到的mapPoint for(vector<MapPoint*>::iterator vit=vpMP.begin(), vend=vpMP.end(); vit!=vend; vit++) { MapPoint* pMP = *vit; if(!pMP) continue; if(pMP->isBad()) continue; //observations表示pMP这个mapPoint可以被哪些关键帧观测到 map<KeyFrame*,size_t> observations = pMP->GetObservations(); //遍历pMP这个mapPoint对应的所有关键帧 for(map<KeyFrame*,size_t>::iterator mit=observations.begin(), mend=observations.end(); mit!=mend; mit++) { //判断该关键帧的id是否和当前关键帧id相同，并统计不相同的个数 if(mit->first->mnId==mnId) continue; //KFcounter的大小是除过当前关键帧之外的能够观察到这个mapPoint的关键帧的个数 //这里是对KFcounter.second进行+1操作，最后的second值为和当前帧共视的mappoint的个数 KFcounter[mit->first]++; } } /** * KFcounter中保存了和当前帧有共视同一个mappoint的keyframe的个数 * */ // This should not happen if(KFcounter.empty()) return; //If the counter is greater than threshold add connection //In case no keyframe counter is over threshold add the one with maximum counter int nmax=0; KeyFrame* pKFmax=NULL; int th = 15; vector<pair<int,KeyFrame*> > vPairs; vPairs.reserve(KFcounter.size()); //遍历KFcounter中统计的结果 for(map<KeyFrame*,int>::iterator mit=KFcounter.begin(), mend=KFcounter.end(); mit!=mend; mit++) { if(mit->second>nmax) { nmax=mit->second;//second是mappoint的个数 pKFmax=mit->first;//first是个keyframe } //表示mit->first这个关键帧和当前关键帧匹配的mappoint大于15 if(mit->second>=th) { //vPairs中存放的为<匹配的mappoint个数，关键帧> vPairs.push_back(make_pair(mit->second,mit->first)); (mit->first)->AddConnection(this,mit->second); } } if(vPairs.empty()) { vPairs.push_back(make_pair(nmax,pKFmax)); pKFmax->AddConnection(this,nmax); } //进行排序 sort(vPairs.begin(),vPairs.end()); list<KeyFrame*> lKFs; list<int> lWs; for(size_t i=0; i<vPairs.size();i++) { lKFs.push_front(vPairs[i].second); lWs.push_front(vPairs[i].first); } { unique_lock<mutex> lockCon(mMutexConnections); // mspConnectedKeyFrames = spConnectedKeyFrames; //mConnectedKeyFrameWeights中存储的是哪些keyframe和当前帧可以共同观察到的mappoint的个数 mConnectedKeyFrameWeights = KFcounter; //和当前关键帧存在共同观察到的mappoint的关键帧 mvpOrderedConnectedKeyFrames = vector<KeyFrame*>(lKFs.begin(),lKFs.end()); //可以共同观察到的mappoint的数量 mvOrderedWeights = vector<int>(lWs.begin(), lWs.end()); //mbFirstConnection值在初始化KeyFrame的时候置为true if(mbFirstConnection && mnId!=0) { mpParent = mvpOrderedConnectedKeyFrames.front(); mpParent->AddChild(this); mbFirstConnection = false; } } }

这里还调用到了位姿优化函数Optimizer::GlobalBundleAdjustemnt(mpMap,20); 优化是个很大的一部分，后边结合代码专门写篇笔记整理下。

3.小结

本篇笔记主要梳理了下图中画框的部分的代码。当然，有些部分还没有梳理，比如优化、单应矩阵分解等，后边再进行梳理。

最新回复(0)