Automatic taxonomy generation: Issues and possibilities
Raghu Krishnapuram, Krishna Kummamuru
IFSA 2003
Data-mining techniques that detect trends and patterns in structured data are often ill-suited to analyzing unstructured text. Information critical to business, generated by groups such as employees, customers, and the public, appears in such forms as chats, electronic discussion forums, and blogs. This paper describes techniques developed to detect themes and trends in such informal communication streams. Our approach begins with unsupervised text clustering to create initial categories. A human analyst then refines the categories into easily understandable themes. To facilitate this process, we developed an interactive approach to text category creation and validation that helps the analyst evaluate each category of a taxonomy and visualize relationships among categories. The resulting analysis can then be communicated to participants in real time. We report on the results of using these techniques in IBM companywide "Jam" events, during which tens of thousands of employees worldwide participated in electronic discussions of key business issues. © Copyright 2006 by International Business Machines Corporation.
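The first step of the approach, unsupervised clustering of text into initial categories that an analyst can then review, might be sketched as follows. This is a minimal illustration and not the authors' system: the abstract does not name a clustering algorithm, so the sketch assumes TF-IDF weighting with a spherical k-means loop and farthest-first seeding. The names `tokenize`, `tfidf_vectors`, `cluster`, and `top_terms`, the stopword list, and the sample posts are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "for", "on", "our", "we", "are"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one unit-length sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({t: v / norm for t, v in vec.items()})
    return vectors


def cosine(u, v):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def cluster(docs, k, iterations=10):
    """Group documents into k clusters; return (labels, centroids)."""
    vectors = tfidf_vectors(docs)
    # Farthest-first seeding: start from the first document, then repeatedly
    # add the document least similar to the seeds chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        idx = min(range(len(vectors)),
                  key=lambda i: max(cosine(vectors[i], c) for c in centroids))
        centroids.append(vectors[idx])
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its most similar centroid.
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors]
        # Recompute each centroid as the normalized sum of its members.
        for j in range(k):
            total = defaultdict(float)
            for v, label in zip(vectors, labels):
                if label == j:
                    for t, w in v.items():
                        total[t] += w
            if total:  # keep the old centroid if the cluster is empty
                norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
                centroids[j] = {t: w / norm for t, w in total.items()}
    return labels, centroids


def top_terms(centroid, n=3):
    """Highest-weighted terms of a centroid, as a rough label for the analyst."""
    return [t for t, _ in sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]


if __name__ == "__main__":
    posts = [
        "Customers want faster support response times",
        "Support response times frustrate customers",
        "Employees ask for flexible remote work policy",
        "Remote work policy helps employees",
    ]
    labels, centroids = cluster(posts, k=2)
    for j, centroid in enumerate(centroids):
        members = [p for p, label in zip(posts, labels) if label == j]
        print(j, top_terms(centroid), members)
```

The top terms per cluster stand in for the draft category labels that a human analyst would then rename, merge, or split; the interactive refinement and visualization described in the abstract are outside the scope of this sketch.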