Distributional phrasal paraphrase generation for statistical machine translation
Abstract
Paraphrase generation has been shown useful for various natural language processing tasks, including statistical machine translation. A commonly used method for paraphrase generation is pivoting [Callison-Burch et al. 2006], which benefits from linguistic knowledge implicit in the sentence alignment of parallel texts, but has limited applicability due to its reliance on parallel texts. Distributional paraphrasing [Marton et al. 2009a] has wider applicability, is more language-independent, but doesn't benefit from any linguistic knowledge. Nevertheless, we show that using distributional paraphrasing can yield greater gains in translation tasks. We report method improvements leading to higher gains than previously published, of almost 2 BLEU points, and provide implementation details, complexity analysis, and further insight into this method. ©2013 ACM.