Hieroglyph: Locally-Sufficient Graph Processing via Compute-Sync-Merge
Abstract
Mainstream graph processing systems (such as Pregel [3] and PowerGraph [1]) follow the bulk synchronous parallel model. This design leads to the tight coupling of computation and communication, where no vertex can proceed to the next iteration of computation until all vertices have been processed in the current iteration and graph states have been synchronized across all hosts. This coupling of computation and communication incurs significant performance penalty. Fully decoupling computation from communication requires (i) restricted access to only local state during computation and (ii) independence of inter-host communication from computation. We call the combination of both conditions local sufficiency. Local sufficiency is not efficiently supported by state of the art. Synchronous systems, by design, do not support local sufficiency due to their intrinsic computation-communication coupling. Even systems that implement asynchronous execution only partially achieve local sufficiency. For example, PowerGraph's asynchronous mode satisfies local sufficiency by distributed scheduling.