Shilei Zhang, Yong Qin
ICASSP 2012
Modern speech applications use acoustic models with billions of parameters and serve millions of users. Storing a separate acoustic model for each user is costly. We show that, through sparse regularization, it is possible to obtain competitive adaptation performance by changing only a small fraction of the parameters of an acoustic model. This allows speaker-dependent models to be compressed, a capability with important implications for systems with millions of users. We achieve performance comparable to the best Maximum A Posteriori (MAP) adaptation models while adapting only 5% of the acoustic model parameters. It is thus possible to compress the speaker-dependent acoustic models by close to a factor of 20. The proposed sparse adaptation criterion improves on previous work in three respects: it combines ℓ₀ and ℓ₁ penalties, uses different adaptation rates for mean and variance parameters, and is invariant to affine transformations. © 2012 IEEE.
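The link between sparsity and compression can be made concrete with a small sketch. It is not the adaptation criterion from the paper: it assumes a plain quadratic fit term per parameter and applies the closed-form minimizer of a combined ℓ₀ + ℓ₁ penalty to the difference between speaker-adapted and speaker-independent parameters. The names `prox_l0_l1`, `compress`, `reconstruct`, and `compression_ratio`, and the penalty weights `lam1` and `lam0`, are hypothetical and chosen only for illustration.

```python
import math


def prox_l0_l1(d, lam1, lam0):
    """Minimize 0.5*(x - d)**2 + lam1*|x| + lam0*[x != 0] over x.

    Illustrative assumption: a quadratic fit term stands in for the
    real adaptation objective.
    """
    # l1 part: soft-thresholding shrinks the change toward zero.
    s = math.copysign(max(abs(d) - lam1, 0.0), d)
    # l0 part: keep the shrunken change only if it outweighs the
    # fixed cost lam0 of storing a nonzero entry.
    return s if 0.5 * s * s > lam0 else 0.0


def compress(si_params, adapted_params, lam1, lam0):
    """Return {index: delta} for the parameters that remain changed."""
    deltas = {}
    for i, (base, new) in enumerate(zip(si_params, adapted_params)):
        x = prox_l0_l1(new - base, lam1, lam0)
        if x != 0.0:
            deltas[i] = x
    return deltas


def reconstruct(si_params, deltas):
    """Rebuild a speaker-dependent parameter list from the sparse deltas."""
    out = list(si_params)
    for i, x in deltas.items():
        out[i] += x
    return out


def compression_ratio(n_params, deltas):
    """Ratio of total parameters to stored deltas (index overhead ignored)."""
    return n_params / len(deltas) if deltas else float("inf")


# Toy model: 20 parameters, one large change and one negligible change.
si = [1.0] * 20
adapted = list(si)
adapted[3] = 2.0
adapted[7] = 1.05
deltas = compress(si, adapted, lam1=0.1, lam0=0.02)
ratio = compression_ratio(len(si), deltas)
```

In this toy case only one of the 20 parameters survives the penalty, which is 5% of the model and a ratio of 20. The ratio counts stored values only; a real system would also have to store the indices of the changed parameters, which is one reason the achievable factor sits somewhat below 1/0.05.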