Contextual revision in information seeking conversation systems
Keith Houck
ICSLP 2004
Prosody structure prediction plays an important role in text-to-speech (TTS) conversion systems, where it is a prior step to parametric prosody prediction. Dynamic programming (DP) and decision tree (DT) based methods are widely used for this purpose, but both have well-known limitations. In this paper, we present a combination of both methods, explore the relationship between corpus size and accuracy for three different prediction tasks, and report on the use of various lexical features. We show that a combination of dynamic programming and decision trees is the best choice for prosodic word boundary prediction, while decision trees alone give the best results for the prediction of prosodic phrase boundaries. Although the methods were originally developed for Chinese, we finally demonstrate their transfer to two other languages, Korean and German, where similar results are achieved.