David W. Jacobs, Daphna Weinshall, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Many applications require matching objects to a predefined, yet highly dynamic set of categories accompanied by category descriptions. We present a novel approach to solving this class of categorization problems by formulating it in a semi-supervised clustering framework. Text-based matching is performed to generate "soft" seeds, which are then used to guide clustering in the basic feature space. We introduce a new variation of the k-means algorithm, called Soft Seeded k-means, which can effectively incorporate seeds that are of varying degrees of confidence, while allowing for incomplete coverage of the pre-defined categories. The algorithm is applied to real-world data from a business analytics application, and we demonstrate that it leads to superior performance compared to previous approaches. © 2008 IEEE.
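The abstract does not spell out the Soft Seeded k-means procedure, so the following is only a minimal sketch of one way confidence-weighted seeds could guide k-means. It is not the authors' algorithm. The function name `soft_seeded_kmeans`, the `penalty` parameter, the additive cost for overriding a seed, and the random fallback for unseeded categories are all assumptions made for illustration.

```python
import numpy as np

def soft_seeded_kmeans(X, k, seeds, penalty=1.0, n_iter=50, rng_seed=0):
    """Hypothetical sketch: k-means guided by confidence-weighted seeds.

    X       : (n, d) array of feature vectors.
    k       : number of predefined categories.
    seeds   : dict mapping point index -> (category, confidence in [0, 1]).
    penalty : assumed weight on the cost of overriding a seed label.
    """
    rng = np.random.default_rng(rng_seed)
    n, d = X.shape

    # conf[i, c] is the seed confidence that point i belongs to category c.
    conf = np.zeros((n, k))
    for i, (c, w) in seeds.items():
        conf[i, c] = w

    # Initialise each centroid as the confidence-weighted mean of its seeds.
    # A category with no seeds (incomplete coverage) starts at a random point.
    centroids = np.empty((k, d))
    for c in range(k):
        total = conf[:, c].sum()
        if total > 0:
            centroids[c] = conf[:, c] @ X / total
        else:
            centroids[c] = X[rng.integers(n)]

    # Per-point seed confidence; zero for unseeded points.
    seed_conf = conf.sum(axis=1, keepdims=True)
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Assigning a seeded point away from its seed category costs
        # penalty * confidence, so weak seeds are easy to override.
        cost = dist + penalty * (seed_conf - conf)
        new_labels = cost.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Standard k-means update; empty clusters keep their old centroid.
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return labels, centroids

# Toy data: two groups; point 1 carries a low-confidence seed for the wrong category.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
seeds = {0: (0, 0.9), 3: (1, 0.8), 1: (1, 0.1)}
labels, centroids = soft_seeded_kmeans(X, k=2, seeds=seeds)
print(labels.tolist())  # [0, 0, 0, 1, 1, 1]
```

In this toy run the weakly seeded point 1 ends up in category 0 because the distance term outweighs its small seed penalty, which is the behaviour one would expect from "soft" seeds. Raising `penalty` makes seeds harder to override.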
Minerva M. Yeung, Fred Mintzer
ICIP 1997
Graham Mann, Indulis Bernsteins
DIMEA 2007
Fearghal O'Donncha, Albert Akhriev, et al.
Big Data 2021