Measuring convergence in language model estimation using relative entropy
Abstract
Language models are generally estimated using smoothed counting techniques. These counting schemes can be viewed as nonlinear functions operating on a Bernoulli process that converge asymptotically to the true density. The rate at which these counting schemes converge to the true density is constrained by the available training data and the nature of the language model (LM) being estimated. In this paper we treat language model estimates as random variables and present an efficient relative entropy (R.E.) based approach to study their convergence with increasing training data size. We present experimental results for language modeling in a generic LVCSR system and a medical-domain dialogue task. We also present an efficient recursive R.E. computation method which can be used as an LM distance measure for a number of tasks, including LM clustering.
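For reference, the relative entropy (Kullback-Leibler divergence) between two language models p and q is D(p||q) = sum over w of p(w) log(p(w)/q(w)). The sketch below is a minimal illustration of using R.E. as a distance between two add-alpha smoothed unigram estimates built from different amounts of data; it is not the recursive computation described in the paper, and the function names, smoothing choice, and toy data are assumptions made for illustration only.

```python
import math
from collections import Counter

def unigram_lm(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram estimate over a fixed vocabulary (illustrative)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def relative_entropy(p, q):
    """D(p || q) = sum_w p(w) * log(p(w) / q(w)), in nats."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

# Compare an LM estimated from a small sample against one from a larger sample;
# as the small sample grows, this R.E. should shrink toward zero.
vocab = {"the", "patient", "has", "a", "fever", "cough"}
small = "the patient has a fever".split()
large = ("the patient has a fever the patient has a cough " * 50).split()

p_small = unigram_lm(small, vocab)
p_large = unigram_lm(large, vocab)
print(relative_entropy(p_large, p_small))
```

The smoothing guarantees q(w) > 0 for every word in the vocabulary, so the divergence is always finite; this mirrors why smoothed counting, rather than raw relative frequencies, is assumed when comparing LM estimates.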