Improved Mandarin keyword spotting using confusion garbage model
Shilei Zhang, Zhiwei Shuang, et al.
ICPR 2010
Hierarchical prosody structure generation is a key component of a speech synthesis system. This paper presents a statistical method that predicts the prosody structure for a Chinese text-to-speech (TTS) system by combining dynamic programming with rules. The method is based on a manually annotated corpus extracted from natural speech (IBM Mandarin TTS Corpus for Female 02). Experimental results show that an accuracy of 91.2% can be achieved in predicting prosodic structure. A state-of-the-art Mandarin TTS system was built on the hierarchical prosody structure, and listening tests show that the prosody structure works well.
Qin Shi, Kun Li, et al.
INTERSPEECH 2010
Weibin Zhu, Wei Zhang, et al.
IEEE Workshop on Speech Synthesis 2002
Weibin Zhu, Wei Zhang, et al.
ISCSLP 2004