About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
CIKM 2015
Conference paper
Lifespan-based partitioning of index structures for time-travel text search
Abstract
Time-travel text search over a temporally evolving document collection is useful in various applications. Supporting a wide range of query classes demanded by these applications require different index layouts optimized for their respective query access patterns. The problem we tackle is how to efficiently handle different query classes using the same index layout. Our approach is to use list intersections on single-attribute indexes of keywords and temporal attributes. Although joint predicate evaluation on single-attribute indexes is inefficient in general, we show that partitioning the index based on version lifespans coupled with exploiting the transaction-time ordering of record-identifiers, can significantly reduce the cost of list intersections. We empirically evaluate different index partitioning alternatives on top of open-source Lucene, and show that our approach is the only technique that can simultaneously support a wide range of query classes efficiently, have high indexing throughput in a real-time ingestion setting, and also have negligible extra storage costs.