Scaling IR-system evaluation using term relevance sets

Einat Amitay; David Carmel; Ronny Lempel; Aya Soffer

doi:10.1145/1008992.1008997

SIGIR 2004

Conference paper

25 Jul 2004

Abstract

This paper describes an evaluation method based on Term Relevance Sets (Trels) that measures an IR system's quality by examining the content of the retrieved results rather than by looking for pre-specified relevant pages. Trels consist of a list of terms believed to be relevant for a particular query as well as a list of irrelevant terms. The proposed method does not involve any document relevance judgments, and as such is not adversely affected by changes to the underlying collection. Therefore, it can better scale to very large, dynamic collections such as the Web. Moreover, this method can evaluate a system's effectiveness on an updatable "live" collection, or on collections derived from different data sources. Our experiments show that the proposed method is very highly correlated with official TREC measures.
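The idea of judging retrieved content against term lists, rather than against a list of pre-judged documents, can be illustrated with a minimal sketch. The abstract does not give the paper's scoring functions, so everything below is an assumption made for illustration: the function names (`tokenize`, `trels_doc_score`, `trels_run_score`), the "count of relevant terms present minus count of irrelevant terms present" formula, the averaging over results, and the example query and term lists are all hypothetical. Only single-word terms are handled.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def trels_doc_score(text, on_topic, off_topic):
    """Score one retrieved document against a term relevance set.

    Assumed formula (not taken from the paper): the number of distinct
    relevant terms present minus the number of distinct irrelevant
    terms present.
    """
    tokens = tokenize(text)
    hits = sum(1 for term in on_topic if term in tokens)
    misses = sum(1 for term in off_topic if term in tokens)
    return hits - misses


def trels_run_score(results, on_topic, off_topic):
    """Average the per-document scores over a ranked result list.

    Averaging is an illustrative choice; no document relevance
    judgments are consulted, only the text of what was retrieved.
    """
    if not results:
        return 0.0
    total = sum(trels_doc_score(doc, on_topic, off_topic) for doc in results)
    return total / len(results)


# Hypothetical term relevance set for the query "jaguar" (the animal).
on_topic = {"cat", "wildlife", "habitat"}
off_topic = {"car", "dealer"}

results = [
    "The jaguar is a big cat whose habitat spans rainforest.",
    "Visit your local Jaguar car dealer today.",
]

for doc in results:
    print(trels_doc_score(doc, on_topic, off_topic), doc)
print("run score:", trels_run_score(results, on_topic, off_topic))
```

Because the score depends only on the text of the returned results, the same term lists can be reused after the collection changes, which is the scaling property the abstract emphasizes.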
