TME: An knowledge-based information extraction system

Shixia Liu; Liping Yang

ICEIS 2003

Conference paper

23 Apr 2003

Abstract

Information extraction is a form of shallow text processing that locates a specified set of relevant information in a natural-language document. In this paper, a system called Template Match Engine (TME) is developed to extract useful information from unlabelled texts. The main feature of this system is that it improves and refines the initial extraction pattern using concept knowledge that is incrementally acquired from the corpus. The system first builds an initial pattern from domain knowledge. The initial pattern is then used to extract information from electronic documents. This step produces feedback words by enlarging and analyzing the extracted information. Next, the pattern is refined using the feedback words and the concept knowledge related to them. Finally, the refined pattern is used to extract the specified information from electronic documents. The experimental results show that the TME system increases recall without loss of precision.
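The four steps in the abstract (initial pattern, first extraction, feedback words, refined extraction) can be illustrated with a small sketch. This is not the TME implementation, which the abstract does not detail. It assumes that a pattern is a set of trigger words compiled into a regular expression, that "enlarging" means looking at the word immediately before an already extracted item, and that concept knowledge is a plain dictionary of related words. All function names, the toy corpus, and the concept table are hypothetical.

```python
import re

def build_pattern(triggers):
    # Assumption: a pattern is "trigger word followed by a capitalized name".
    alternation = "|".join(re.escape(t) for t in sorted(triggers))
    return re.compile(rf"\b(?:{alternation})\s+([A-Z]\w*)")

def extract(pattern, documents):
    # Collect every captured name, in document order.
    return [m.group(1) for doc in documents for m in pattern.finditer(doc)]

def feedback_words(extracted, documents, triggers):
    # Assumption: "enlarging" the extracted information means inspecting the
    # lowercase word right before each extracted item anywhere in the corpus.
    words = set()
    for item in set(extracted):
        context = re.compile(rf"\b([a-z]+)\s+{re.escape(item)}\b")
        for doc in documents:
            words.update(m.group(1) for m in context.finditer(doc))
    return words - set(triggers)

def refine(triggers, feedback, concepts):
    # Add the feedback words plus any concept-related words to the triggers.
    refined = set(triggers) | set(feedback)
    for word in feedback:
        refined |= concepts.get(word, set())
    return refined

# Toy corpus and hypothetical concept knowledge.
documents = [
    "Acme acquired Widgets yesterday.",
    "Globex bought Widgets last year.",
    "Globex purchased Initech in May.",
]
concepts = {"bought": {"purchased"}}

triggers = {"acquired"}
initial = extract(build_pattern(triggers), documents)
feedback = feedback_words(initial, documents, triggers)
refined_triggers = refine(triggers, feedback, concepts)
refined = extract(build_pattern(refined_triggers), documents)

print(initial)   # ['Widgets']
print(refined)   # ['Widgets', 'Widgets', 'Initech']
```

In this toy run the initial pattern finds only one mention, while the refined pattern also finds "Initech" through the feedback word "bought" and its related concept "purchased". This mirrors the recall gain the abstract reports, but only as an illustration and not as a reproduction of the paper's experiments.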
