Frame-level AnyBoost for LVCSR with the MMI criterion
Ryuki Tachibana, Takashi Fukuda, et al.
ASRU 2011
In this paper we explore discriminative language modeling (DLM) on highly optimized state-of-the-art large vocabulary Arabic broadcast speech recognition systems used for the Phase 5 DARPA GALE Evaluation. In particular, we study in detail a minimum Bayes risk (MBR) criterion for DLM. MBR training outperforms perceptron training. Interestingly, we found that our DLMs generalized to mismatched conditions, such as using a different acoustic model during testing. We also examine the interesting problem of unsupervised DLM training using a Bayes risk metric as a surrogate for word error rate (WER). In some experiments, we were able to obtain about half of the gain of the supervised DLM.
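As a rough sketch of the MBR criterion for DLM mentioned in the abstract (the N-best restriction, the feature map \Phi, and the weight vector \Lambda below are generic illustrative assumptions, not necessarily the paper's exact formulation): given an N-best list h_1, ..., h_N for an utterance x with reference transcript r, a log-linear model defines a posterior over hypotheses, and MBR training minimizes the expected word edit distance under that posterior:

\[
P_\Lambda(h_i \mid x) \;=\; \frac{\exp\!\big(\Lambda \cdot \Phi(x, h_i)\big)}{\sum_{j=1}^{N} \exp\!\big(\Lambda \cdot \Phi(x, h_j)\big)},
\qquad
\mathcal{L}_{\mathrm{MBR}}(\Lambda) \;=\; \sum_{i=1}^{N} P_\Lambda(h_i \mid x)\,\mathrm{Lev}(h_i, r),
\]

where \mathrm{Lev}(h, r) is the Levenshtein (word-level edit) distance. In the unsupervised setting described in the abstract, a Bayes risk metric takes the place of this reference-based loss as a surrogate for WER.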