Affine invariant sparse maximum a posteriori adaptation

Peder A. Olsen; Jing Huang; Steven J. Rennie; Vaibhava Goel

doi:10.1109/ICASSP.2012.6288874

ICASSP 2012

Conference paper

23 Oct 2012

Abstract

Modern speech applications utilize acoustic models with billions of parameters, and serve millions of users. Storing an acoustic model for each user is costly. We show that, through the use of sparse regularization, it is possible to obtain competitive adaptation performance by changing only a small fraction of the parameters of an acoustic model. This allows for the compression of speaker-dependent models: a capability that has important implications for systems with millions of users. We achieve a performance comparable to the best Maximum A Posteriori (MAP) adaptation models while only adapting 5% of the acoustic model parameters. Thus it is possible to compress the speaker-dependent acoustic models by close to a factor of 20. The proposed sparse adaptation criterion improves three aspects of previous work: it combines ℓ0 and ℓ1 penalties, has different adaptation rates for mean and variance parameters, and is invariant to affine transformations. © 2012 IEEE.
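
The storage arithmetic behind the "factor of 20" figure can be illustrated with a small sketch. This is not the paper's adaptation criterion: it is only a toy illustration, under the assumption that a speaker-dependent model is stored as the shared speaker-independent parameters plus a sparse set of overridden values. The function names, the model size of 10,000 and the 0.5 offset are all hypothetical.

```python
import random

def sparse_overrides(base, adapted):
    """Keep only the parameters that differ from the shared model (hypothetical helper)."""
    return {i: a for i, (b, a) in enumerate(zip(base, adapted)) if a != b}

def apply_overrides(base, overrides):
    """Rebuild the speaker-dependent parameters from the shared model plus overrides."""
    out = list(base)
    for i, value in overrides.items():
        out[i] = value
    return out

random.seed(0)
n = 10_000                                   # toy model size, an assumption
base = [random.gauss(0.0, 1.0) for _ in range(n)]

# Pretend adaptation changed 5% of the parameters (the fraction quoted in the abstract).
adapted = list(base)
for i in random.sample(range(n), n // 20):
    adapted[i] = base[i] + 0.5               # arbitrary illustrative change

overrides = sparse_overrides(base, adapted)
restored = apply_overrides(base, overrides)

# Ratio of dense parameter count to stored per-speaker values.
ratio = n / len(overrides)
print(len(overrides), ratio)
```

In this sketch the ratio counts stored values only; keeping the index of each changed parameter alongside its value would bring the practical saving somewhat below that figure.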
