Publication
ICSLP 2004
Conference paper

Fast semi-automatic semantic annotation for spoken dialog systems

Abstract

This paper describes a bootstrapping methodology for semi-automatic semantic annotation of the "mini-corpus" that is conventionally annotated manually to train an initial parser used in natural language understanding (NLU) systems. We propose to cast the problem of semantic annotation as a classification problem: each word is assigned a unique set of semantic tag(s) and/or label(s) from the universal tag/label set. This approach enables "local" semantic annotation, resulting in partially annotated sentences. The proposed method reduces the annotation time and cost, which form a major bottleneck in the development of NLU systems. We present a set of experiments conducted on a medical-domain "mini-corpus" containing 10K hand-annotated sentences. Three annotation methods are compared: parser-based (baseline), similarity-based, and classification-based annotation. The support vector machine (SVM) based classification scheme is shown to outperform both similarity-based and parser-based annotation.
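
The idea of annotation as per-word classification can be illustrated with a minimal sketch. It is not the paper's implementation: the feature set, the tag names (`O`, `SYMPTOM`, `DRUG`), the toy sentences, and the `min_margin` threshold are all assumptions made for illustration, and a simple margin perceptron stands in for the SVM so that the example needs only the Python standard library. Words whose best tag does not win by a sufficient margin are left unannotated, which mimics "local" annotation that yields partially annotated sentences.

```python
from collections import defaultdict


def features(words, i):
    """Context-window features for the word at position i (hypothetical feature set)."""
    prev_w = words[i - 1] if i > 0 else "<s>"
    next_w = words[i + 1] if i + 1 < len(words) else "</s>"
    return [f"w={words[i]}", f"prev={prev_w}", f"next={next_w}"]


class PerceptronTagger:
    """Multiclass margin perceptron: a linear stand-in for the SVM classifier."""

    def __init__(self):
        self.weights = defaultdict(float)  # (feature, tag) -> weight
        self.tags = set()

    def scores(self, feats):
        """Linear score of every known tag for one feature list."""
        return {t: sum(self.weights[(f, t)] for f in feats) for t in self.tags}

    def train(self, sentences, epochs=10):
        for _, tags in sentences:
            self.tags.update(tags)
        for _ in range(epochs):
            for words, tags in sentences:
                for i, gold in enumerate(tags):
                    feats = features(words, i)
                    sc = self.scores(feats)
                    # Strongest competing tag; ties are broken alphabetically.
                    rival = max(sorted(t for t in self.tags if t != gold),
                                key=lambda t: sc[t])
                    # Update whenever the gold tag does not win by a margin of 1.
                    if sc[gold] - sc[rival] < 1.0:
                        for f in feats:
                            self.weights[(f, gold)] += 1.0
                            self.weights[(f, rival)] -= 1.0

    def annotate(self, words, min_margin=1.0):
        """Tag each word independently; low-margin words stay unannotated (None)."""
        out = []
        for i in range(len(words)):
            sc = self.scores(features(words, i))
            ranked = sorted(sc.items(), key=lambda kv: (-kv[1], kv[0]))
            best_tag, best = ranked[0]
            second = ranked[1][1] if len(ranked) > 1 else float("-inf")
            out.append(best_tag if best - second >= min_margin else None)
        return out


# Toy hand-annotated "mini-corpus" (invented for illustration only).
TRAIN = [
    (["i", "have", "a", "headache"], ["O", "O", "O", "SYMPTOM"]),
    (["take", "aspirin", "daily"], ["O", "DRUG", "O"]),
]

if __name__ == "__main__":
    tagger = PerceptronTagger()
    tagger.train(TRAIN)
    # Words the classifier is unsure about come back as None.
    print(tagger.annotate(["severe", "headache"]))
```

In a bootstrapping loop of the kind the abstract describes, the `None` positions would be the ones left for a human annotator, while confidently tagged words are accepted automatically.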
