Ripple: Improved architecture and programming model for bulk synchronous parallel style of analytics

Mike Spreitzer; Malgorzata Steinder; Ian Whalley

doi:10.1109/ICDCS.2013.67

ICDCS 2013

Conference paper

01 Dec 2013

Abstract

We present Ripple, an architecture and programming model for a broad set of data analytics. Ripple builds on the ideas of iterated MapReduce and adds two innovations. First, it has a richer programming model that incorporates more ideas from the Bulk Synchronous Parallel (BSP) model of computation, among others. This makes Ripple a flexible, higher-level platform that is easier to work with for both application programmers and platform implementors. Second, Ripple is based on a limited interface for key/value storage, making it portable across many different key/value store implementations. Together, these two ideas improve the scope, performance, and openness of the data analytics platform. We evaluate Ripple using three representative, non-trivial data analysis scenarios that require iterative computation. Using these examples, we show how Ripple achieves clear performance advantages over iterated MapReduce. © 2013 IEEE.
