Stereo-based stochastic mapping with context using probabilistic PCA for noise robust automatic speech recognition
Abstract
In this paper we investigate stereo-based stochastic mapping (SSM) with context for the noise robustness of automatic speech recognition, especially under unseen conditions. Probabilistic PCA (PPCA) is used in the SSM framework to reduce the high dimensionality of the noisy speech features with context and derive an eigen representation in the noisy feature space for the prediction of clean features. To reduce the computational cost in training, an approximation by single-pass re-training is considered for the estimation of joint GMM. We also show that the SSM estimate under the minimum mean square error (MMSE) in a space where low dimensional representation of clean speech and uncorrelated additive noise can be assumed is related to the subspace speech enhancement. Experiments on large vocabulary continuous speech recognition tasks observe gains from the proposed approach under the conditions with seen, unseen and real noise. © 2012 IEEE.