Fast speaker adaptive training for speech recognition

Daniel Povey; Hong-Kwang J. Kuo; Hagen Soltau

INTERSPEECH 2008

Conference paper

01 Dec 2008


Abstract

In this paper we describe various fast and convenient implementations of Speaker Adaptive Training (SAT) for the case where Maximum Likelihood Linear Regression (MLLR) will be used at test time to adapt Gaussian means. For most of these implementations, the memory and disk requirements are similar to those of normal ML training; in all cases the computation is dominated by the need to compute the MLLR transforms. MLLR is commonly combined with Constrained MLLR (CMLLR), which can be viewed as a feature-space affine transform and has its own form of SAT (we call this CMLLR-SAT); we experiment with combining the two forms of SAT. We find that MLLR-SAT gives improvements even on top of CMLLR-SAT. Copyright © 2008 ISCA.
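
The distinction drawn in the abstract between MLLR (adapting Gaussian means) and CMLLR (viewable as a feature-space affine transform) can be illustrated with a minimal scalar sketch. This is not the paper's implementation: the function names, the one-dimensional setting, and the parameters `a` and `b` are assumptions chosen only for illustration. The sketch relies on the standard identity that transforming a Gaussian's mean and variance with the same affine map gives the same likelihood as applying the inverse map to the observation and scaling by the Jacobian.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mllr_adapt_mean(mean, a, b):
    """MLLR-style adaptation (hypothetical helper): only the mean is transformed."""
    return a * mean + b


def cmllr_model_space(x, mean, var, a, b):
    """Constrained transform applied to the model: mean and variance share `a`."""
    return gaussian_pdf(x, a * mean + b, a * a * var)


def cmllr_feature_space(x, mean, var, a, b):
    """Same likelihood, computed by transforming the observation instead.

    The observation is mapped through the inverse affine transform and the
    density is scaled by the absolute Jacobian |1/a|.
    """
    a_inv = 1.0 / a
    x_hat = a_inv * (x - b)
    return abs(a_inv) * gaussian_pdf(x_hat, mean, var)


if __name__ == "__main__":
    x, mean, var, a, b = 0.3, 1.0, 2.0, 1.5, -0.2
    print("MLLR-adapted mean:", mllr_adapt_mean(mean, a, b))
    print("model-space likelihood:  ", cmllr_model_space(x, mean, var, a, b))
    print("feature-space likelihood:", cmllr_feature_space(x, mean, var, a, b))
```

In the multivariate case the scalar `a` becomes a matrix and the Jacobian factor becomes the absolute determinant of its inverse; the scalar version is used here only to keep the example self-contained.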
