fMPE: Discriminatively trained features for speech recognition

Daniel Povey; Brian Kingsbury; Lidia Mangu; George Saon; Hagen Soltau; Geoffrey Zweig

doi:10.1109/ICASSP.2005.1415275

ICASSP 2005

Conference paper

18 Mar 2005

Abstract

MPE (Minimum Phone Error) is a previously introduced technique for discriminative training of HMM parameters. fMPE applies the same objective function to the features, transforming the data with a kernel-like method and training millions of parameters, comparable to the size of the acoustic model. Despite the large number of parameters, fMPE is robust to over-training. The method is to train a matrix projecting from posteriors of Gaussians to a normal size feature space, and then to add the projected features to normal features such as PLP. The matrix is trained from a zero start using a linear method. Sparsity of posteriors ensures speed in both training and test time. The technique gives similar improvements to MPE (around 10% relative). MPE on top of fMPE results in error rates up to 6.5% relative better than MPE alone, or more if multiple layers of transform are trained. © 2005 IEEE.
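The feature transform described in the abstract can be sketched as follows. This is a minimal illustration based only on the abstract, not the authors' implementation: it computes sparse Gaussian posteriors for one frame, projects them with a matrix, and adds the result to the original features. The function names (`gaussian_posteriors`, `fmpe_transform`), the equal-prior diagonal Gaussians, the pruning threshold `floor`, and the column-wise storage of the matrix `M` are all assumptions made for the example. Training `M` with the MPE objective is not shown.

```python
import math


def gaussian_posteriors(x, means, variances, floor=1e-3):
    """Return sparse posteriors {gaussian_index: posterior} for frame x.

    Assumes diagonal-covariance Gaussians with equal priors (an
    illustrative simplification). Posteriors at or below `floor` are
    dropped, which is what keeps the projection cheap.
    """
    log_likes = []
    for mu, var in zip(means, variances):
        ll = 0.0
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_likes.append(ll)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_likes)
    exps = [math.exp(ll - top) for ll in log_likes]
    total = sum(exps)
    return {j: e / total for j, e in enumerate(exps) if e / total > floor}


def fmpe_transform(x, sparse_post, M):
    """Compute y = x + M h, where h is the sparse posterior vector.

    M is stored column-wise: M[j] is the feature-space offset
    contributed by Gaussian j. Only nonzero posteriors are visited.
    """
    y = list(x)
    for j, p in sparse_post.items():
        column = M[j]
        for d in range(len(y)):
            y[d] += column[d] * p
    return y


# Toy setup: two 2-dimensional Gaussians and one feature frame.
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
frame = [0.1, -0.2]

# The matrix starts at zero, so the transform initially leaves x unchanged.
M_zero = [[0.0, 0.0], [0.0, 0.0]]
post = gaussian_posteriors(frame, means, variances)
print(fmpe_transform(frame, post, M_zero))
```

With a zero matrix the output equals the input frame, which matches the "zero start" mentioned in the abstract: training begins from the unmodified features and learns only the offsets to add.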
