Domain-Aware Dependency Parsing for Questions
Abstract
Parsing natural language questions in specific domains is crucial to a wide range of applications, from question answering to dialog systems. Pre-trained parsers are usually trained on corpora dominated by non-questions and therefore perform poorly on domain-specific questions. Retraining parsers on domain-specific questions labeled with syntactic parse trees is expensive, since such annotations require linguistic expertise. In this paper, we propose a framework that automatically generates labeled domain questions by leveraging domain knowledge and seed domain questions. We evaluate our approach in two domains and release the generated question datasets. Our experimental results demonstrate that the auto-generated labeled questions lead to a significant (4.9%–9%) increase in the accuracy of state-of-the-art (SoTA) parsers on domain questions.