Distributed graph query processing in dynamic networks
Abstract
In this paper we examine a popular network computational model (BSP: Bulk Synchronous Parallel) that has been adopted by the Google Pregel system to support large scale graph processing. We show that the synchronicity assumption made by the BSP model, while acceptable in data center like environments with strong and persistent network connectivity, can result in severe performance penalties in the context of dynamic networks. We introduce a new computational model (BAP: Bulk Asynchronous Parallel) that preserves the bulk and parallel nature of the BSP model but extends the model to asynchronous network communication. We consider two popular classes of graph queries (random walk queries and shortest path queries), present both BSP and BAP algorithms for these queries and evaluate their performance using realistic graphs datasets (DBLP and Flickr) and dynamic network datasets (Infocom06 and MIT Reality dataset). Our initial results show that in dynamic networks BAP algorithms can achieve several orders of magnitude in improvement for various QoI metrics such as accuracy and latency of (partial and complete) query evaluation. © 2012 IEEE.