Safe Distribution and Parallel Execution of Data-Centric Workflows over the Publish/Subscribe Abstraction
Abstract
In this work, we develop an approach for the safe distribution and parallel execution of data-centric workflows over the publish/subscribe abstraction. In essence, we design a unique representation of data-centric workflows that exploits the loosely coupled and distributed nature of publish/subscribe systems. Furthermore, we argue for the practicality and expressiveness of our approach by mapping a standard, industry-strength data-centric workflow model, namely IBM Business Artifacts with Guard-Stage-Milestone (GSM), into the publish/subscribe abstraction. In short, the contributions of this work are three-fold: (1) a mapping of data-centric workflows into publish/subscribe to achieve distributed and parallel execution; (2) a detailed theoretical analysis of the mapping; and (3) a formulation of optimal workflow distribution over the publish/subscribe abstraction as an NP-hard problem.
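To make the idea of running a workflow over publish/subscribe concrete, the following is a minimal sketch and not the mapping developed in this work. It assumes a toy in-memory, topic-based broker and a simplified reading of GSM in which a stage subscribes to the data attributes its guard depends on, runs its task once the guard holds, and publishes the result as a milestone. All names (Broker, Stage, run_example, and the topic strings) are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory topic-based publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks
        self.log = []  # ordered record of every published (topic, value) pair

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, value):
        self.log.append((topic, value))
        # Deliver synchronously to everyone subscribed to this topic.
        for callback in list(self.subscribers[topic]):
            callback(topic, value)


class Stage:
    """Hypothetical stage: guard = subscription condition, milestone = publication."""

    def __init__(self, broker, name, inputs, guard, task, milestone):
        self.broker = broker
        self.name = name
        self.inputs = inputs        # topics the guard reads
        self.guard = guard          # predicate over the values seen so far
        self.task = task            # computes the milestone value
        self.milestone = milestone  # topic published on completion
        self.seen = {}
        self.done = False
        for topic in inputs:
            broker.subscribe(topic, self.on_event)

    def on_event(self, topic, value):
        self.seen[topic] = value
        # Wait until every input has arrived, and run at most once.
        if self.done or len(self.seen) < len(self.inputs):
            return
        if self.guard(self.seen):
            self.done = True
            self.broker.publish(self.milestone, self.task(self.seen))


def run_example():
    broker = Broker()
    # Stages know only topic names, never each other (loose coupling).
    Stage(broker, "check_credit", ["order.amount"],
          guard=lambda d: d["order.amount"] > 0,
          task=lambda d: d["order.amount"] <= 1000,
          milestone="credit.approved")
    Stage(broker, "ship", ["credit.approved", "order.address"],
          guard=lambda d: d["credit.approved"],
          task=lambda d: "shipped to " + d["order.address"],
          milestone="order.shipped")
    broker.publish("order.address", "12 Main St")
    broker.publish("order.amount", 250)
    return broker.log
```

In this sketch the two stages interact only through topics, so each could in principle be placed on a different node of a distributed broker. Deciding which stage goes where is the distribution problem referred to in contribution (3); the sketch does not attempt to solve it.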