A two-tier index architecture for fast processing large RDF data over distributed memory

Long Cheng; Spyros Kotoulas; Tomas E. Ward; Georgios Theodoropoulos

doi:10.1145/2631775.2631789

HT 2014

Conference paper

01 Sep 2014

Abstract

We propose an efficient method for fast processing of large RDF data over distributed memory. Our approach adopts a two-tier index architecture on each computation node: (1) a light-weight primary index, to keep loading times low, and (2) a dynamic, multi-level secondary index, calculated as a by-product of query execution, to reduce or eliminate inter-machine data movement for subsequent queries that contain the same graph patterns. Experimental results on a commodity cluster show that we can load large RDF data into memory very quickly while remaining within an interactive range for query processing with the secondary index. © 2014 Authors.
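The two-tier idea can be illustrated with a minimal single-node sketch in Python. This is not the authors' implementation: the class name `TwoTierNode`, the choice to key the primary index by predicate only, and the flat pattern-keyed cache standing in for the multi-level secondary index are all assumptions made for illustration. The sketch also leaves out distribution entirely. It shows only how an index built as a by-product of one query can serve a later query with the same pattern.

```python
class TwoTierNode:
    """Toy stand-in for one computation node (hypothetical design, not the paper's)."""

    def __init__(self):
        # Primary index: cheap to build at load time; here keyed by predicate only.
        self.primary = {}
        # Secondary index: filled lazily, keyed by the triple pattern that produced it.
        self.secondary = {}
        # Counts how often the slower primary-index path is taken.
        self.primary_scans = 0

    def load(self, triples):
        """Insert (subject, predicate, object) triples into the primary index."""
        for s, p, o in triples:
            self.primary.setdefault(p, []).append((s, o))

    def match(self, pattern):
        """Match a pattern (s, p, o); None is a wildcard, and p must be bound."""
        if pattern in self.secondary:
            # Same pattern seen before: answer from the secondary index.
            return self.secondary[pattern]
        s, p, o = pattern
        self.primary_scans += 1
        result = [
            (ts, p, to)
            for ts, to in self.primary.get(p, [])
            if (s is None or s == ts) and (o is None or o == to)
        ]
        # The secondary index entry is a by-product of answering the query.
        self.secondary[pattern] = result
        return result


node = TwoTierNode()
node.load([("a", "knows", "b"), ("b", "knows", "c"), ("a", "likes", "c")])
first = node.match((None, "knows", None))   # scans the primary index
second = node.match((None, "knows", None))  # served from the secondary index
```

In this sketch the second call does not touch the primary index. In a distributed setting the analogous saving would be avoided data movement between machines, which this toy does not model.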
