Publication
ICASSP 2005
Conference paper

Language model estimation for optimizing end-to-end performance of a natural language call routing system

Abstract

Conventional methods for training statistical models for automatic speech recognition, such as acoustic and language models, have focused on criteria such as maximum likelihood and sentence or word error rate (WER). However, unlike dictation systems, the goal for spoken dialogue systems is to understand the meaning of what a person says, not to get every word correctly transcribed. For such systems, we propose to optimize the statistical models under end-to-end system performance criteria. We illustrate this principle by focusing on the estimation of the language model (LM) component of a natural language call routing system. This estimation, carried out under a conditional maximum likelihood objective, aims at optimizing the call routing (classification) accuracy, which is often the criterion of interest in these systems. LM updates are derived using the extended Baum-Welch procedure of Gopalakrishnan et al. In our experiments, we find that our estimation procedure leads to a small but promising gain in classification accuracy. Interestingly, the estimated language models also lead to an increase in the word error rate while improving the classification accuracy, showing that the system with the best classification accuracy is not necessarily the one with the lowest WER. Significantly, our LM estimation procedure does not require the correct transcription of the training data, and can therefore be applied to unsupervised learning from untranscribed speech data. © 2005 IEEE.
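
The abstract names the extended Baum-Welch procedure but does not give the update formula. As a rough illustration only, the sketch below shows the generic growth-transform update commonly associated with that procedure, applied to a single discrete probability distribution (for example, the word probabilities in one LM context). The function name `ebw_update`, the smoothing constant `D`, and the way gradients are supplied are assumptions made for this sketch; it is not the paper's actual LM update, and computing the gradient of the conditional likelihood of the correct route is left out entirely.

```python
import numpy as np

def ebw_update(probs, grads, D):
    """One growth-transform (extended Baum-Welch style) step.

    probs: current probabilities of a discrete distribution (sum to 1)
    grads: partial derivatives of the objective w.r.t. each probability
    D:     hypothetical smoothing constant; it must make every
           (gradient + D) term positive so the result stays a distribution
    """
    probs = np.asarray(probs, dtype=float)
    grads = np.asarray(grads, dtype=float)

    # Shift the gradients by D so every weight is strictly positive.
    weights = grads + D
    if np.any(weights <= 0):
        raise ValueError("D too small: every (grad + D) must be positive")

    # Reweight each probability, then renormalize to sum to 1.
    unnormalized = probs * weights
    return unnormalized / unnormalized.sum()
```

Under these assumptions, entries with a larger gradient gain probability mass and entries with a smaller gradient lose it, while a larger `D` makes each step more conservative. With all gradients equal, the distribution is returned unchanged.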
