Shilei Zhang, Yong Qin
ICASSP 2012
In this paper, we investigate stereo-based stochastic mapping (SSM) with context for noise-robust automatic speech recognition, especially under unseen conditions. Probabilistic PCA (PPCA) is used in the SSM framework to reduce the high dimensionality of the noisy speech features with context and to derive an eigen representation in the noisy feature space for predicting clean features. To reduce the computational cost of training, the joint GMM is estimated approximately by single-pass re-training. We also show that the SSM estimate under the minimum mean square error (MMSE) criterion, in a space where a low-dimensional representation of clean speech and uncorrelated additive noise can be assumed, is related to subspace speech enhancement. Experiments on large vocabulary continuous speech recognition tasks show gains from the proposed approach under seen, unseen, and real noise conditions. © 2012 IEEE.
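As a rough illustration of the MMSE prediction step mentioned in the abstract, the sketch below computes the textbook conditional-mean estimate of clean features x from noisy features y under a joint GMM over z = [x; y]. It is not the authors' implementation: the abstract does not give the exact formulation, the PPCA projection of the context-expanded noisy features is omitted, and the function name `mmse_clean_estimate` and all of its parameters are hypothetical.

```python
import numpy as np

def mmse_clean_estimate(y, weights, means, covs, dx):
    """Conditional-mean (MMSE) estimate of x given y under a joint GMM on z = [x; y].

    Assumed layout (hypothetical): each mean has length dx + dy with the clean
    part first, and each covariance is the matching (dx + dy) x (dx + dy) matrix.
    """
    y = np.asarray(y, dtype=float)
    log_post = np.empty(len(weights))
    cond_means = []
    for k, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        mu_x, mu_y = mu[:dx], mu[dx:]
        cov_xy, cov_yy = cov[:dx, dx:], cov[dx:, dx:]
        diff = y - mu_y
        sol = np.linalg.solve(cov_yy, diff)      # cov_yy^{-1} (y - mu_y)
        _, logdet = np.linalg.slogdet(cov_yy)
        # Log of w_k * N(y; mu_y, cov_yy), the unnormalized component posterior.
        log_post[k] = np.log(w) - 0.5 * (diff @ sol + logdet + y.size * np.log(2 * np.pi))
        # Per-component conditional mean E[x | y, k].
        cond_means.append(mu_x + cov_xy @ sol)
    post = np.exp(log_post - log_post.max())     # normalize stably in the log domain
    post /= post.sum()
    return post @ np.array(cond_means)           # posterior-weighted sum of conditional means

# Toy usage: one component, 1-D clean and 1-D noisy features.
x_hat = mmse_clean_estimate(
    y=[2.0],
    weights=[1.0],
    means=[np.array([0.0, 0.0])],
    covs=[np.array([[1.0, 0.5], [0.5, 1.0]])],
    dx=1,
)
```

With a single component the estimate reduces to linear regression of x on y; in the toy call it is 0.5 * 2.0 = 1.0. In a setup like the one the abstract describes, y would presumably be the PPCA-reduced noisy feature vector with context rather than a raw frame.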
Mingbo Ma, Liang Huang, et al.
ACL 2017
Bowen Zhou, Bing Xiang, et al.
SSST 2008
Mo Yu, Wenpeng Yin, et al.
ACL 2017