Table of Contents
1.特征点匹配相关理论简介
2.ORB-SLAM2中特征匹配代码分析
(1)Tracking线程中的状态机
(2)单目相机初始化函数MonocularInitialization代码分析
(3)ORBmatcher::SearchForInitialization函数代码分析
(4)mpInitializer->Initialize函数代码分析
1)单应矩阵计算相机位姿
2)基础矩阵计算相机位姿
(5)CreateInitialMapMonocular函数代码分析
3.小结
在 ORB-SLAM2代码阅读笔记(四):Tracking线程2——ORB特征提取 中已经分析了从图像中提取关键点和描述子的代码流程,接下来就是对提取的特征点进行匹配。特征点匹配简单来说,就是把两帧图片中提取的特征进行对比,找到图片一中的特征点在图片二中相对应的特征点。这样一帧一帧的图片之间才有了关联,图片帧也才有了意义。
因为系统中要求只有输入的前两帧图中提取的特征点数都大于100,才能进行前期的初始化。结合实际看看,只有前边输入的两帧图像提取的特征够多,这时候才能去进行匹配进而计算出这两帧图像之间的相机位姿变换,后期输入的图像的位姿都是在此基础上进行估算的,所以说前两帧图像的初始化也就是为系统设定了一些初始值(正所谓有了个初始状态)。
ORB-SLAM2中的特征点匹配分为关键点匹配和描述子匹配:
1)关键点匹配:ORB-SLAM2将图像提取的特征值“比较均匀”的分布在了48*64的网格中,这样当得到F1帧中的特征点坐标(x,y)后,就可以计算该(x,y)坐标在F2的的特征点网格中的哪个网格内。系统中的计算方法是在以(x,y)为中心半径为100的范围内进行匹配,查找F2帧在这个范围内和(x,y)距离最小于半径100的特征点,找到的都算作F1帧中坐标为(x,y)的特征点在F2帧中潜在的匹配点。这些潜在的匹配点会存放在一个列表中。
2)描述子匹配:在关键点匹配的基础上进行描述子匹配,此时遍历1)中得到的潜在的特征匹配点列表,计算每个特征点的描述子和F1中(x,y)所在的特征点描述子的距离,记录距离最小和次小的两个特征点。同时,最小距离要小于TH_LOW(代码中值为50),并且要小于次小距离的0.9倍的距离,才算描述子基本匹配成功。最后,还要进行描述子方向的比较,只有描述子方向也相同才认为描述子匹配成功。
ORB-SLAM2这种划分网格的匹配方式比遍历一个个关键点进行暴力匹配的效率要高很多。并且分了关键点匹配和描述子匹配两个步骤,这样在关键点匹配完成后,就已经过滤了许多特征点,再进行描述子匹配就会轻松好多。
所以,ORB-SLAM2中特征点匹配成功的条件是:关键点匹配成功&&描述子距离符合要求&&描述子方向检测。
本部分的代码入口函数为Tracking.cc中的Track()函数。
Track函数代码按照流程主要分为3步:
1)第一次调用Track函数时,根据输入帧进行初始化。主要是对前两帧图像初始化和特征点匹配;
2)当初始化完成后,再有图像帧传入的时候,进行相机跟踪也就是位姿估计;
3)存储根据当前帧计算出的相机位姿(也就是变换矩阵等数据);
Track函数代码大体结构如下图所示:
Tracking中的状态机如下:
SYSTEM_NOT_READY=-1, //系统初始化开始前为FrameDrawer类对象中的mState赋值为SYSTEM_NOT_READY NO_IMAGES_YET=0, //如果图片复位过或者第一次运行,则为NO_IMAGES_YET NOT_INITIALIZED=1,//系统未初始化好时的状态标记 OK=2,//系统初始化完成或者重定位后会标记为OK状态 LOST=3 //局部地图跟踪失败的情况下标记为该状态,系统中为TrackLocalMap函数调用返回false的情况下给mState重置该状态MonocularInitialization()函数中进行两帧间特征点匹配的函数为下面这行代码,调用ORBmatcher中的SearchForInitialization函数。
//计算当前帧和初始化帧之间的匹配关系 /** * step 3:在mInitialFrame与mCurrentFrame中找匹配的特征点对 * mvbPrevMatched为前一帧的特征点的坐标,存储了mInitialFrame中哪些点将进行接下来的匹配 * mvIniMatches用于存储mInitialFrame,mCurrentFrame之间匹配的特征点 */ int nmatches = matcher.SearchForInitialization(mInitialFrame,mCurrentFrame,mvbPrevMatched,mvIniMatches,100);GetFeaturesInArea函数代码解析如下:
/** * 根据选取的初始帧的特征点的位置(x,y)作为输入,然后在第二帧该位置半径r的范围内,寻找可能匹配的点,注意这边的匹配只考虑了原始图像即尺度图像的第一层的特征 * 思路: * 1.从初始化帧的关键点坐标(x,y)计算出该关键点在网格中半径为r的范围; * 2.遍历这个范围内的网格,获取当前帧在这个范围内的关键点的坐标; * 3.并计算该坐标关键点距离初始化关键点(x,y)的距离,<r表示匹配,将匹配的关键点index存入vIndices中; * 输出:vIndices中存的是距离<r的关键点的index,也就表示匹配的关键点的index */ vector<size_t> Frame::GetFeaturesInArea(const float &x, const float &y, const float &r, const int minLevel, const int maxLevel) const { vector<size_t> vIndices; vIndices.reserve(N); //计算以关键点坐标(x,y)为中心,半径为r的的区域在48*64的网格中的位置 const int nMinCellX = max(0,(int)floor((x-mnMinX-r)*mfGridElementWidthInv)); if(nMinCellX>=FRAME_GRID_COLS) return vIndices; const int nMaxCellX = min((int)FRAME_GRID_COLS-1,(int)ceil((x-mnMinX+r)*mfGridElementWidthInv)); if(nMaxCellX<0) return vIndices; const int nMinCellY = max(0,(int)floor((y-mnMinY-r)*mfGridElementHeightInv)); if(nMinCellY>=FRAME_GRID_ROWS) return vIndices; const int nMaxCellY = min((int)FRAME_GRID_ROWS-1,(int)ceil((y-mnMinY+r)*mfGridElementHeightInv)); if(nMaxCellY<0) return vIndices; const bool bCheckLevels = (minLevel>0) || (maxLevel>=0); //遍历mGrid网格,这个网格中存放的是特征点坐标值 //查找当前帧的关键点中和传入的上个帧的关键点(x,y)距离<r的关键点 for(int ix = nMinCellX; ix<=nMaxCellX; ix++) { for(int iy = nMinCellY; iy<=nMaxCellY; iy++) { //从mGrid中获取当前Frame网格里的关键点的index值,可以通过对应的index在mvKeysUn中获取关键点 const vector<size_t> vCell = mGrid[ix][iy]; if(vCell.empty()) continue; for(size_t j=0, jend=vCell.size(); j<jend; j++) { //vCell[j]获取的是关键点在mvKeysUn中的index值 const cv::KeyPoint &kpUn = mvKeysUn[vCell[j]]; if(bCheckLevels) { if(kpUn.octave<minLevel) continue; if(maxLevel>=0) if(kpUn.octave>maxLevel) continue; } //(x,y)为初始化帧的关键点的坐标,kpUn.pt.x、kpUn.pt.y为当前帧的关键点 const float distx = kpUn.pt.x-x; const float disty = kpUn.pt.y-y; if(fabs(distx)<r && fabs(disty)<r) vIndices.push_back(vCell[j]); } } } return vIndices; }
计算单应矩阵的线程入口函数如下:
/** * 计算单应矩阵 * */ void Initializer::FindHomography(vector<bool> &vbMatchesInliers, float &score, cv::Mat &H21) { // Number of putative matches //mvMatches12中存放的pair里的i为前一帧的关键点的index,vMatches12[i]为CurrentFrame中与i想匹配的关键点的index const int N = mvMatches12.size(); // Normalize coordinates vector<cv::Point2f> vPn1, vPn2; cv::Mat T1, T2; //mvKeys1为参考帧的关键点,mvKeys2为当前帧的关键点 Normalize(mvKeys1,vPn1, T1); Normalize(mvKeys2,vPn2, T2); //计算T2的逆 cv::Mat T2inv = T2.inv(); // Best Results variables score = 0.0; vbMatchesInliers = vector<bool>(N,false); // Iteration variables vector<cv::Point2f> vPn1i(8); vector<cv::Point2f> vPn2i(8); cv::Mat H21i, H12i; vector<bool> vbCurrentInliers(N,false); float currentScore; // Perform all RANSAC iterations and save the solution with highest score //mMaxIterations值为200 for(int it=0; it<mMaxIterations; it++) { // Select a minimum set //每次取8对点在ComputeH21中进行计算 for(size_t j=0; j<8; j++) { int idx = mvSets[it][j]; //获取匹配的关键点对中参考帧里的index值 vPn1i[j] = vPn1[mvMatches12[idx].first]; //获取匹配的关键点对中当前帧里的index值 vPn2i[j] = vPn2[mvMatches12[idx].second]; } //根据匹配的8对点计算 cv::Mat Hn = ComputeH21(vPn1i,vPn2i); H21i = T2inv*Hn*T1; H12i = H21i.inv(); currentScore = CheckHomography(H21i, H12i, vbCurrentInliers, mSigma); if(currentScore>score) { H21 = H21i.clone(); vbMatchesInliers = vbCurrentInliers; score = currentScore; } } } //8对点法计算单应矩阵 cv::Mat Initializer::ComputeH21(const vector<cv::Point2f> &vP1, const vector<cv::Point2f> &vP2) { //因为是8点法计算,所以这里的N值为8 const int N = vP1.size(); //A为16*9的矩阵 cv::Mat A(2*N,9,CV_32F); //遍历这8对点 for(int i=0; i<N; i++) { const float u1 = vP1[i].x; const float v1 = vP1[i].y; const float u2 = vP2[i].x; const float v2 = vP2[i].y; A.at<float>(2*i,0) = 0.0; A.at<float>(2*i,1) = 0.0; A.at<float>(2*i,2) = 0.0; A.at<float>(2*i,3) = -u1; A.at<float>(2*i,4) = -v1; A.at<float>(2*i,5) = -1; A.at<float>(2*i,6) = v2*u1; A.at<float>(2*i,7) = v2*v1; A.at<float>(2*i,8) = v2; A.at<float>(2*i+1,0) = u1; A.at<float>(2*i+1,1) = v1; A.at<float>(2*i+1,2) = 1; A.at<float>(2*i+1,3) = 0.0; A.at<float>(2*i+1,4) = 0.0; A.at<float>(2*i+1,5) = 0.0; A.at<float>(2*i+1,6) = -u2*u1; A.at<float>(2*i+1,7) = -u2*v1; A.at<float>(2*i+1,8) = -u2; } cv::Mat u,w,vt; //进行SVD分解 cv::SVDecomp(A,w,u,vt,cv::SVD::MODIFY_A | cv::SVD::FULL_UV); return vt.row(8).reshape(0, 3); }从单应矩阵计算相机位姿:
bool Initializer::ReconstructH(vector<bool> &vbMatchesInliers, cv::Mat &H21, cv::Mat &K, cv::Mat &R21, cv::Mat &t21, vector<cv::Point3f> &vP3D, vector<bool> &vbTriangulated, float minParallax, int minTriangulated) { int N=0; for(size_t i=0, iend = vbMatchesInliers.size() ; i<iend; i++) if(vbMatchesInliers[i]) N++; // We recover 8 motion hypotheses using the method of Faugeras et al. // Motion and structure from motion in a piecewise planar environment. // International Journal of Pattern Recognition and Artificial Intelligence, 1988 cv::Mat invK = K.inv(); cv::Mat A = invK*H21*K; cv::Mat U,w,Vt,V; //SVD分解 cv::SVD::compute(A,w,U,Vt,cv::SVD::FULL_UV); V=Vt.t(); float s = cv::determinant(U)*cv::determinant(Vt); float d1 = w.at<float>(0); float d2 = w.at<float>(1); float d3 = w.at<float>(2); if(d1/d2<1.00001 || d2/d3<1.00001) { return false; } vector<cv::Mat> vR, vt, vn; vR.reserve(8); vt.reserve(8); vn.reserve(8); //n'=[x1 0 x3] 4 posibilities e1=e3=1, e1=1 e3=-1, e1=-1 e3=1, e1=e3=-1 float aux1 = sqrt((d1*d1-d2*d2)/(d1*d1-d3*d3)); float aux3 = sqrt((d2*d2-d3*d3)/(d1*d1-d3*d3)); float x1[] = {aux1,aux1,-aux1,-aux1}; float x3[] = {aux3,-aux3,aux3,-aux3}; //case d'=d2 float aux_stheta = sqrt((d1*d1-d2*d2)*(d2*d2-d3*d3))/((d1+d3)*d2); float ctheta = (d2*d2+d1*d3)/((d1+d3)*d2); float stheta[] = {aux_stheta, -aux_stheta, -aux_stheta, aux_stheta}; for(int i=0; i<4; i++) { cv::Mat Rp=cv::Mat::eye(3,3,CV_32F); Rp.at<float>(0,0)=ctheta; Rp.at<float>(0,2)=-stheta[i]; Rp.at<float>(2,0)=stheta[i]; Rp.at<float>(2,2)=ctheta; cv::Mat R = s*U*Rp*Vt; vR.push_back(R); cv::Mat tp(3,1,CV_32F); tp.at<float>(0)=x1[i]; tp.at<float>(1)=0; tp.at<float>(2)=-x3[i]; tp*=d1-d3; cv::Mat t = U*tp; vt.push_back(t/cv::norm(t)); cv::Mat np(3,1,CV_32F); np.at<float>(0)=x1[i]; np.at<float>(1)=0; np.at<float>(2)=x3[i]; cv::Mat n = V*np; if(n.at<float>(2)<0) n=-n; vn.push_back(n); } //case d'=-d2 float aux_sphi = sqrt((d1*d1-d2*d2)*(d2*d2-d3*d3))/((d1-d3)*d2); float cphi = (d1*d3-d2*d2)/((d1-d3)*d2); float sphi[] = {aux_sphi, -aux_sphi, -aux_sphi, aux_sphi}; for(int i=0; i<4; i++) { cv::Mat Rp=cv::Mat::eye(3,3,CV_32F); Rp.at<float>(0,0)=cphi; Rp.at<float>(0,2)=sphi[i]; Rp.at<float>(1,1)=-1; Rp.at<float>(2,0)=sphi[i]; Rp.at<float>(2,2)=-cphi; cv::Mat R = s*U*Rp*Vt; vR.push_back(R); cv::Mat tp(3,1,CV_32F); tp.at<float>(0)=x1[i]; tp.at<float>(1)=0; tp.at<float>(2)=x3[i]; tp*=d1+d3; cv::Mat t = U*tp; vt.push_back(t/cv::norm(t)); cv::Mat np(3,1,CV_32F); np.at<float>(0)=x1[i]; np.at<float>(1)=0; np.at<float>(2)=x3[i]; cv::Mat n = V*np; if(n.at<float>(2)<0) n=-n; vn.push_back(n); } int bestGood = 0; int secondBestGood = 0; int bestSolutionIdx = -1; float bestParallax = -1; vector<cv::Point3f> bestP3D; vector<bool> bestTriangulated; // Instead of applying the visibility constraints proposed in the Faugeras' paper (which could fail for points seen with low parallax) // We reconstruct all hypotheses and check in terms of triangulated points and parallax for(size_t i=0; i<8; i++) { float parallaxi; vector<cv::Point3f> vP3Di; vector<bool> vbTriangulatedi; int nGood = CheckRT(vR[i],vt[i],mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K,vP3Di, 4.0*mSigma2, vbTriangulatedi, parallaxi); if(nGood>bestGood) { secondBestGood = bestGood; bestGood = nGood; bestSolutionIdx = i; bestParallax = parallaxi; bestP3D = vP3Di; bestTriangulated = vbTriangulatedi; } else if(nGood>secondBestGood) { secondBestGood = nGood; } } if(secondBestGood<0.75*bestGood && bestParallax>=minParallax && bestGood>minTriangulated && bestGood>0.9*N) { vR[bestSolutionIdx].copyTo(R21); vt[bestSolutionIdx].copyTo(t21); vP3D = bestP3D; vbTriangulated = bestTriangulated; return true; } return false; }计算基础矩阵的线程入口函数如下:
/** * 计算基础矩阵 * */ void Initializer::FindFundamental(vector<bool> &vbMatchesInliers, float &score, cv::Mat &F21) { // Number of putative matches const int N = vbMatchesInliers.size(); // Normalize coordinates vector<cv::Point2f> vPn1, vPn2; cv::Mat T1, T2; Normalize(mvKeys1,vPn1, T1); Normalize(mvKeys2,vPn2, T2); cv::Mat T2t = T2.t(); // Best Results variables score = 0.0; vbMatchesInliers = vector<bool>(N,false); // Iteration variables vector<cv::Point2f> vPn1i(8); vector<cv::Point2f> vPn2i(8); cv::Mat F21i; vector<bool> vbCurrentInliers(N,false); float currentScore; // Perform all RANSAC iterations and save the solution with highest score for(int it=0; it<mMaxIterations; it++) { // Select a minimum set for(int j=0; j<8; j++) { int idx = mvSets[it][j]; vPn1i[j] = vPn1[mvMatches12[idx].first]; vPn2i[j] = vPn2[mvMatches12[idx].second]; } cv::Mat Fn = ComputeF21(vPn1i,vPn2i); F21i = T2t*Fn*T1; currentScore = CheckFundamental(F21i, vbCurrentInliers, mSigma); if(currentScore>score) { F21 = F21i.clone(); vbMatchesInliers = vbCurrentInliers; score = currentScore; } } } /** * 计算基础矩阵 * 本质矩阵为E,vP2的坐标为(u2,v2,1),vP1的坐标为(u1,v1,1) * 公式vP2*E*vP1=0成立,展开来就是下面公式: * |e1 e2 e3| |u1| * (u2, v1, 1) * |e4 e5 e6| *|v1| = 0 * |e7 e8 e9| |1 | * 计算整理可得: * u2*u1*e1 + u2*v1*e2 + u2*e3 + v2*u1*e4 + v2*v1*e5 + v2*e6 + u1*e7 + v1*e8 + 1*e9 = 0 * 将E作为向量提取出来可得: * (u2*u1 + u2*v1 + u2 + v2*u1 + v2*v1 + v2 + u1 + v1 + 1) * E = 0 * 这个函数中构造的A矩阵就是根据这个来构造的,可以求出本质矩阵E,进而根据公式F=K^-T*E*K^-1来计算出F * */ cv::Mat Initializer::ComputeF21(const vector<cv::Point2f> &vP1,const vector<cv::Point2f> &vP2) { const int N = vP1.size(); cv::Mat A(N,9,CV_32F); for(int i=0; i<N; i++) { const float u1 = vP1[i].x; const float v1 = vP1[i].y; const float u2 = vP2[i].x; const float v2 = vP2[i].y; A.at<float>(i,0) = u2*u1; A.at<float>(i,1) = u2*v1; A.at<float>(i,2) = u2; A.at<float>(i,3) = v2*u1; A.at<float>(i,4) = v2*v1; A.at<float>(i,5) = v2; A.at<float>(i,6) = u1; A.at<float>(i,7) = v1; A.at<float>(i,8) = 1; } cv::Mat u,w,vt; cv::SVDecomp(A,w,u,vt,cv::SVD::MODIFY_A | cv::SVD::FULL_UV); cv::Mat Fpre = vt.row(8).reshape(0, 3); cv::SVDecomp(Fpre,w,u,vt,cv::SVD::MODIFY_A | cv::SVD::FULL_UV); w.at<float>(2)=0; return u*cv::Mat::diag(w)*vt; }从基础矩阵计算相机位姿:
/** * 思路:利用基础矩阵计算出本质矩阵,再从本质矩阵中分解出旋转和平移矩阵 * */ bool Initializer::ReconstructF(vector<bool> &vbMatchesInliers, cv::Mat &F21, cv::Mat &K, cv::Mat &R21, cv::Mat &t21, vector<cv::Point3f> &vP3D, vector<bool> &vbTriangulated, float minParallax, int minTriangulated) { int N=0; for(size_t i=0, iend = vbMatchesInliers.size() ; i<iend; i++) if(vbMatchesInliers[i]) N++; // Compute Essential Matrix from Fundamental Matrix /** * 根据基础矩阵来计算本质矩阵 * E = t^ * R,F = K^-T * E * K^-1 (K^-T为K矩阵转置的逆矩阵,K^-1为K的逆矩阵) * 此时F左乘K^T,右成K,可得K^T * F * K = E,下面的本质矩阵E21就是这么来计算的 * */ cv::Mat E21 = K.t()*F21*K; cv::Mat R1, R2, t; // Recover the 4 motion hypotheses //利用本质矩阵E21来恢复四种情况的运动假设 DecomposeE(E21,R1,R2,t); cv::Mat t1=t; cv::Mat t2=-t; // Reconstruct with the 4 hyphoteses and check vector<cv::Point3f> vP3D1, vP3D2, vP3D3, vP3D4; vector<bool> vbTriangulated1,vbTriangulated2,vbTriangulated3, vbTriangulated4; float parallax1,parallax2, parallax3, parallax4; //对计算出来的4种解进行相机深度值的检查,判断标准是关键点在两个帧中计算出的深度都要为正值 int nGood1 = CheckRT(R1,t1,mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K, vP3D1, 4.0*mSigma2, vbTriangulated1, parallax1); int nGood2 = CheckRT(R2,t1,mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K, vP3D2, 4.0*mSigma2, vbTriangulated2, parallax2); int nGood3 = CheckRT(R1,t2,mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K, vP3D3, 4.0*mSigma2, vbTriangulated3, parallax3); int nGood4 = CheckRT(R2,t2,mvKeys1,mvKeys2,mvMatches12,vbMatchesInliers,K, vP3D4, 4.0*mSigma2, vbTriangulated4, parallax4); int maxGood = max(nGood1,max(nGood2,max(nGood3,nGood4))); R21 = cv::Mat(); t21 = cv::Mat(); int nMinGood = max(static_cast<int>(0.9*N),minTriangulated); int nsimilar = 0; if(nGood1>0.7*maxGood) nsimilar++; if(nGood2>0.7*maxGood) nsimilar++; if(nGood3>0.7*maxGood) nsimilar++; if(nGood4>0.7*maxGood) nsimilar++; // If there is not a clear winner or not enough triangulated points reject initialization if(maxGood<nMinGood || nsimilar>1) { return false; } // If best reconstruction has enough parallax initialize if(maxGood==nGood1) { if(parallax1>minParallax) { vP3D = vP3D1; vbTriangulated = vbTriangulated1; R1.copyTo(R21); t1.copyTo(t21); return true; } }else if(maxGood==nGood2) { if(parallax2>minParallax) { vP3D = vP3D2; vbTriangulated = vbTriangulated2; R2.copyTo(R21); t1.copyTo(t21); return true; } }else if(maxGood==nGood3) { if(parallax3>minParallax) { vP3D = vP3D3; vbTriangulated = vbTriangulated3; R1.copyTo(R21); t2.copyTo(t21); return true; } }else if(maxGood==nGood4) { if(parallax4>minParallax) { vP3D = vP3D4; vbTriangulated = vbTriangulated4; R2.copyTo(R21); t2.copyTo(t21); return true; } } return false; } //E=t^R 对E进行分解,求出t和R void Initializer::DecomposeE(const cv::Mat &E, cv::Mat &R1, cv::Mat &R2, cv::Mat &t) { cv::Mat u,w,vt; cv::SVD::compute(E,w,u,vt); u.col(2).copyTo(t); t=t/cv::norm(t); cv::Mat W(3,3,CV_32F,cv::Scalar(0)); W.at<float>(0,1)=-1; W.at<float>(1,0)=1; W.at<float>(2,2)=1; R1 = u*W*vt; if(cv::determinant(R1)<0) R1=-R1; R2 = u*W.t()*vt; if(cv::determinant(R2)<0) R2=-R2; }这里还调用到了位姿优化函数Optimizer::GlobalBundleAdjustemnt(mpMap,20); 优化是个很大的一部分,后边结合代码专门写篇笔记整理下。
本篇笔记主要梳理了下图中画框的部分的代码。当然,有些部分还没有梳理,比如优化、单应矩阵分解等,后边再进行梳理。