Publication
SIGMOD Record
Paper
A relational framework for information extraction
Abstract
Information Extraction commonly refers to the task of populating a relational schema, having predefined underlying semantics, from textual content. This task is pervasive in contemporary computational challenges associated with Big Data. In this article we provide an overview of our work on document spanners, a relational framework for Information Extraction that is inspired by rule-based systems such as IBM's SystemT.