This project requires the VOICEBOX toolbox, available at: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
1. Feature extraction
The features most commonly used in speaker (voiceprint) recognition are MFCC and LPC. This article uses MFCC features.
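function [mfcc_feature] = get_features(voice_data, fs)
%GET_FEATURES Extract the MFCC features of a speech signal.
    a = 0.92; % pre-emphasis coefficient, 0.9 < a < 1
    % Pre-emphasis: y(n) = x(n) - a*x(n-1). The coefficient vector must be
    % [1, -a]; writing [1 - a] would evaluate to the scalar 1-a instead.
    voice_data = filter([1, -a], 1, voice_data);
    mfcc_feature = melcepst(voice_data, fs); % extract MFCC features (VOICEBOX)
end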
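To make the pre-emphasis step concrete, here is a minimal sketch of the difference equation y(n) = x(n) - a*x(n-1) applied to a toy signal. The constant input vector is purely illustrative and not real speech. A constant (low-frequency) signal is strongly attenuated after the first sample, which is the intended effect of pre-emphasis.

```matlab
a = 0.92;                    % same pre-emphasis coefficient as in get_features
x = [1; 1; 1; 1];            % toy constant signal (illustrative assumption)
y = filter([1, -a], 1, x);   % y(n) = x(n) - a*x(n-1), zero initial state
disp(y.');                   % first sample passes through, the rest shrink to 1 - a
```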
2. Main script
%% Initialization
GMM_order = 10; % number of Gaussian components per speaker model
train_path = './train';
test_path = './test';
train_info = dir(train_path);
n_speakers = length(train_info) - 2; % subtract the '.' and '..' entries
test_info = dir(strcat(test_path, '/*.wav'));
%% MFCC features
features = cell(1, n_speakers);
for i = 1:n_speakers
    % List the i-th speaker folder (entries 1 and 2 are assumed to be '.' and '..')
    tem_info = dir(strcat(train_info(2 + i).folder, '/', train_info(2 + i).name));
    for j = 1:length(tem_info) - 2
        [voice_data, Fs] = audioread(strcat(tem_info(2 + j).folder, '/', tem_info(2 + j).name));
        if j == 1
            mfcc_features = get_features(voice_data, Fs);
        else
            % Stack the frames of every recording of this speaker row-wise
            mfcc_features = [mfcc_features; get_features(voice_data, Fs)];
        end
        features{i} = mfcc_features;
    end
end
%% Model training
GMModels = cell(1, n_speakers);
options = statset('MaxIter', 2000); % upper limit on EM iterations
epochs = 10; % number of EM restarts, passed as 'Replicates'
for i = 1:n_speakers
    GMModels{i} = fitgmdist(features{i}, GMM_order, 'RegularizationValue', 0.001, ...
        'SharedCovariance', true, 'Options', options, 'Start', 'plus', 'Replicates', epochs);
end
%% Testing
% Note: this loop compares only the models of speakers 1 and 2.
for i = 1:length(test_info)
    [voice_data, Fs] = audioread(strcat(test_info(i).folder, '/', test_info(i).name));
    mfcc_features = get_features(voice_data, Fs);
    % The second output of posterior is the negative log-likelihood,
    % so the model with the smaller value fits the recording better.
    [d1, log1] = posterior(GMModels{1}, mfcc_features);
    [d2, log2] = posterior(GMModels{2}, mfcc_features);
    if log1 < log2
        fprintf('%s label: 1\n', test_info(i).name);
    else
        fprintf('%s label: 2\n', test_info(i).name);
    end
end
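The test loop above compares only the first two speaker models. As a sketch of how the same decision rule could extend to any number of speakers, the helper below picks the model with the smallest negative log-likelihood. The function name identify_speaker is hypothetical and not part of the original project; it assumes every cell of GMModels holds a model returned by fitgmdist.

```matlab
function label = identify_speaker(GMModels, mfcc_features)
%IDENTIFY_SPEAKER Hypothetical helper: index of the best-matching model.
    n_models = numel(GMModels);
    nll = zeros(1, n_models);
    for k = 1:n_models
        % The second output of posterior is the negative log-likelihood
        [~, nll(k)] = posterior(GMModels{k}, mfcc_features);
    end
    [~, label] = min(nll); % the smallest negative log-likelihood wins
end
```

Saved as identify_speaker.m, it could be called inside the test loop as label = identify_speaker(GMModels, mfcc_features), followed by fprintf('%s label: %d\n', test_info(i).name, label).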
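The files should be organized as follows:
The train folder must contain one subfolder per speaker; each subfolder stands on its own and holds that speaker's .wav training recordings.
The test folder contains the .wav files to be identified.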
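The main script relies on this layout: it assumes the first two entries returned by dir are '.' and '..' and that everything else under train is a speaker folder. A stray file in train would be counted as a speaker. The following is one possible way to list the speaker folders more defensively; the helper name list_speaker_dirs is hypothetical and not part of the original project.

```matlab
function names = list_speaker_dirs(train_path)
%LIST_SPEAKER_DIRS Hypothetical helper: names of the speaker subfolders.
    entries = dir(train_path);
    entries = entries([entries.isdir]);            % keep folders, drop plain files
    names = {entries.name};
    names = names(~ismember(names, {'.', '..'}));  % drop the two special entries
end
```

With this helper, the speaker count would be numel(list_speaker_dirs(train_path)) rather than length(train_info) - 2.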