L.R. Bahl, S. Balakrishnan-Aiyer, et al.
ICASSP 1995
Previous work on the distribution of words in documents has shown that word repetitiveness is an important indicator of a word's content-bearing characteristics. In this paper we propose a simple method that uses a measure of the tendency of words to repeat within a document to separate words with similar document frequencies but different topic-discriminating characteristics. We describe the application of the new measure to query-document relevance scoring. Experiments on the TREC Ad Hoc and Spoken Document Retrieval tasks show useful performance improvements.
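The abstract does not define the repetitiveness measure, so the following is only a minimal sketch of the general idea, not the paper's formula. It assumes a common stand-in: the average number of occurrences of a word per document that contains it (collection frequency divided by document frequency). The function name `repetitiveness` and the toy documents are hypothetical.

```python
from collections import Counter


def repetitiveness(docs):
    """Return (scores, df) for a list of tokenized documents.

    scores[w] = collection frequency / document frequency, i.e. the mean
    count of w in the documents that contain it. This is an assumed
    stand-in for the paper's measure, which the abstract does not specify.
    """
    cf = Counter()  # total occurrences of each word in the collection
    df = Counter()  # number of documents containing each word
    for tokens in docs:
        cf.update(tokens)
        df.update(set(tokens))
    scores = {w: cf[w] / df[w] for w in cf}
    return scores, df


# Toy collection (hypothetical): "retrieval" and "system" both occur in
# two documents, but only "retrieval" repeats within a document.
docs = [
    "the retrieval system ranks retrieval results".split(),
    "the system also logs results".split(),
    "retrieval of spoken documents uses retrieval scores".split(),
]
scores, df = repetitiveness(docs)
print(df["retrieval"], scores["retrieval"])
print(df["system"], scores["system"])
```

In this toy collection the two words have the same document frequency, so a weight based on document frequency alone cannot tell them apart, while the repetition score can. How such a score would enter a query-document relevance function is not stated in the abstract and is left out of the sketch.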
G. Iyengar, H.J. Nock, et al.
ICME 2002
S. Dharanipragada, Martin Franz, et al.
ICSLP 2000
S. Dharanipragada, Martin Franz, et al.
INTERSPEECH - Eurospeech 1999