Machine translation in continuous space
Abstract
We present a different perspective on the machine translation problem, one that relies on continuous-space probabilistic models for words and phrases. Within this perspective we propose a method called Tied-Mixture Machine Translation (TMMT), a trainable parametric model that uses Gaussian mixture probability density functions to represent word and phrase pairs. In this perspective, machine translation is treated in the same way as acoustic modeling in speech recognition. This treatment carries several potential advantages that may improve state-of-the-art machine translation systems: better generalization to unseen events; adaptation to new domains, languages, genres, and speakers via methods such as Maximum-Likelihood Linear Regression (MLLR); and improved discrimination through discriminative training methods such as Maximum Mutual Information Estimation (MMIE). Our goal in this paper, however, is to introduce the new approach, leaving investigation of some of these potential advantages to future work. To this end, we report preliminary experiments demonstrating the viability of the proposed method. Copyright © 2008 ISCA.