Higher dimensionality generally makes a learning task harder. Here we introduce a supervised dimension reduction method that builds on the linear dimension reduction framework introduced in
http://blog.csdn.net/philthinker/article/details/70212147
which can be summarized as:
$$z = Tx, \qquad x \in \mathbb{R}^{d},\ z \in \mathbb{R}^{m},\ m < d.$$

Of course, centering the data first is necessary:

$$x_i \leftarrow x_i - \frac{1}{n}\sum_{i'=1}^{n} x_{i'}.$$
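As a minimal sketch of these two steps in MATLAB, assuming the data sit in an n-by-d matrix with one sample per row (the variables `X`, `T`, and `Z` below are placeholders introduced only for illustration, not part of the original post):

```matlab
% Minimal sketch: centering followed by a linear projection z = T*x.
n = 50; d = 3; m = 2;
X = randn(n, d);                    % placeholder data, one sample per row
X = X - repmat(mean(X), [n, 1]);    % centering: x_i <- x_i - (1/n) * sum_i' x_i'
T = randn(m, d);                    % some m-by-d projection matrix (FDA will supply a good one)
Z = X * T';                         % row-wise projection, Z(i,:)' = T * X(i,:)'
```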
Fisher discriminant analysis is one of the most basic supervised linear dimension reduction methods: we seek a T that brings samples with the same label as close together as possible, while pushing samples with different labels apart. To begin with, define the within-class scatter matrix $S^{(w)}$ and the between-class scatter matrix $S^{(b)}$ as

$$S^{(w)} = \sum_{y=1}^{c} \sum_{i: y_i = y} (x_i - \mu_y)(x_i - \mu_y)^{T} \in \mathbb{R}^{d \times d},$$
$$S^{(b)} = \sum_{y=1}^{c} n_y\, \mu_y \mu_y^{T} \in \mathbb{R}^{d \times d},$$

where $\mu_y = \frac{1}{n_y}\sum_{i: y_i = y} x_i$, $\sum_{i: y_i = y}$ stands for the sum over the samples $i$ with label $y_i = y$, and $n_y$ is the number of samples belonging to class $y$. The projection matrix T is then defined by

$$\max_{T \in \mathbb{R}^{m \times d}} \operatorname{tr}\!\left( (T S^{(w)} T^{T})^{-1}\, T S^{(b)} T^{T} \right).$$

In other words, the goal is to minimize the projected within-class scatter $T S^{(w)} T^{T}$ while maximizing the projected between-class scatter $T S^{(b)} T^{T}$. This optimization problem can be solved with the same approach used in Unsupervised Dimension Reduction, i.e. the generalized eigenvalue problem

$$S^{(b)} \xi = \lambda S^{(w)} \xi,$$

whose generalized eigenvalues are ordered as $\lambda_1 \ge \cdots \ge \lambda_d \ge 0$ with corresponding eigenvectors $\xi_1, \dots, \xi_d$. Taking the eigenvectors of the largest m eigenvalues gives the solution

$$\hat{T} = (\xi_1, \dots, \xi_m)^{T}.$$

A two-class demo in MATLAB:

```matlab
n = 100; x = randn(n, 2);
x(1:n/2, 1) = x(1:n/2, 1) - 4;           % shift class 1 to the left
x(n/2+1:end, 1) = x(n/2+1:end, 1) + 4;   % shift class 2 to the right
x = x - repmat(mean(x), [n, 1]);         % centering
y = [ones(n/2, 1); 2*ones(n/2, 1)];      % labels

m1 = mean(x(y==1, :)); x1 = x(y==1, :) - repmat(m1, [n/2, 1]);
m2 = mean(x(y==2, :)); x2 = x(y==2, :) - repmat(m2, [n/2, 1]);

% largest generalized eigenvector of S^(b) * xi = lambda * S^(w) * xi
[t, v] = eigs(n/2*(m1')*m1 + n/2*(m2')*m2, x1'*x1 + x2'*x2, 1);

figure(1); clf; hold on; axis([-8 8 -6 6]);
plot(x(y==1, 1), x(y==1, 2), 'bo');
plot(x(y==2, 1), x(y==2, 2), 'rx');
plot(99*[-t(1) t(1)], 99*[-t(2) t(2)], 'k-');  % the learned projection direction
```

Attention please: when the samples within a class form several peaks (i.e. the class is multimodal), the output fails to be ideal. Local Fisher Discriminant Analysis may work better in that case.
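The demo above hard-codes two classes and a single projection direction. As a rough sketch of the same recipe for an arbitrary number of classes and m directions (the function name `fisher_lda` and its signature are assumptions introduced here, not part of the original post):

```matlab
function T = fisher_lda(X, y, m)
% Sketch of Fisher discriminant analysis for c classes (illustrative only).
% X: n-by-d centered data matrix, y: n-by-1 integer labels, m: target dimension.
    d = size(X, 2);
    labels = unique(y);
    Sw = zeros(d, d);                    % within-class scatter S^(w)
    Sb = zeros(d, d);                    % between-class scatter S^(b)
    for k = 1:numel(labels)
        Xk = X(y == labels(k), :);
        nk = size(Xk, 1);
        mu = mean(Xk, 1);
        Xk = Xk - repmat(mu, [nk, 1]);
        Sw = Sw + Xk' * Xk;
        Sb = Sb + nk * (mu') * mu;
    end
    % top-m generalized eigenvectors of S^(b) * xi = lambda * S^(w) * xi
    [V, ~] = eigs(Sb, Sw, m);
    T = V';                              % rows of T are the projection directions
end
```

Called as `T = fisher_lda(x, y, 1)` on the toy data above, it should recover the same direction as the inline demo, up to sign and scaling.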