Publication
ICSLP 2002
Conference paper
Maximum entropy model for punctuation annotation from speech
Abstract
In this paper we develop a maximum-entropy-based method for annotating spontaneous conversational speech with punctuation. The goal of this task is to make automatic transcriptions more readable by humans, and to render them into a form that is useful for subsequent natural language processing and discourse analysis. Our basic approach is to view the insertion of punctuation as a form of tagging, in which words are tagged with appropriate punctuation, and to apply a maximum entropy tagger that uses both lexical and prosodic features. We present experimental results on Switchboard data with both reference transcriptions and transcriptions produced by a speech recognition system.
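To make the tagging view concrete, the following is a minimal sketch of punctuation insertion as per-word classification with a maximum entropy (multinomial logistic) model. It is not the paper's implementation. The label set, the feature templates (current word, next word, and a pause-length bucket standing in for a prosodic feature), the 0.3 second pause threshold, the training settings, and the toy utterance are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

# Hypothetical label set: the punctuation mark that follows each word.
LABELS = ["NONE", "COMMA", "PERIOD"]


def features(words, pauses, i):
    """Binary features for the slot after word i (templates are illustrative)."""
    nxt = words[i + 1] if i + 1 < len(words) else "</s>"
    return [
        "bias",
        "w=" + words[i],                                      # lexical: current word
        "next=" + nxt,                                        # lexical: following word
        "pause=" + ("long" if pauses[i] >= 0.3 else "short"), # prosodic stand-in
    ]


def probs(weights, feats):
    """Maximum entropy distribution: softmax over summed feature weights."""
    scores = [sum(weights[(f, y)] for f in feats) for y in LABELS]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def train(examples, epochs=200, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            p = probs(weights, feats)
            for k, y in enumerate(LABELS):
                # Gradient for a binary feature: observed minus expected count.
                grad = (1.0 if y == gold else 0.0) - p[k]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights


def tag(weights, words, pauses):
    """Assign the most probable punctuation label to each word."""
    out = []
    for i in range(len(words)):
        p = probs(weights, features(words, pauses, i))
        out.append(LABELS[p.index(max(p))])
    return out


# Toy utterance with made-up pause durations (seconds) after each word.
words = ["yeah", "i", "think", "so", "well", "it", "depends"]
pauses = [0.1, 0.0, 0.0, 0.6, 0.2, 0.0, 0.7]
gold = ["COMMA", "NONE", "NONE", "PERIOD", "COMMA", "NONE", "PERIOD"]

examples = [(features(words, pauses, i), gold[i]) for i in range(len(words))]
weights = train(examples)
predicted = tag(weights, words, pauses)
```

Here `predicted` holds one label per word, so a transcript can be rendered by appending the corresponding mark after each word. A realistic system would train on far more data, use richer features, and add regularization; this sketch only illustrates how lexical and prosodic cues can be combined in a single log-linear model.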