Abstract
Dynamic program analysis techniques depend on accurate program traces. Program instrumentation is commonly used to collect these traces, which causes overhead to the program execution. Various techniques have addressed this problem by minimizing the number of probes/witnesses used to collect traces. In this paper, we present a novel distributed trace collection framework wherein, a program is executed multiple times with the same input for different sets of witnesses. The partial traces such obtained are then merged to create the whole program trace. Such divide-and-conquer strategy enables parallel collection of partial traces, thereby reducing the total time of collection. The problem is particularly challenging as arbitrary distribution of witnesses cannot guarantee correct formation of traces. We provide and prove a necessary and sufficient condition for distributing the witnesses which ensures correct formation of trace. Moreover, we describe witness distribution strategies that are suitable for parallel collection. We use the framework to collect traces of field SAP-ABAP programs using breakpoints as witnesses as instrumentation cannot be performed due to practical constraints. To optimize such collection, we extend Ball-Larus' optimal edge-based profiling algorithm to an optimal node-based algorithm. We demonstrate the effectiveness of the framework for collecting traces of SAP-ABAP programs. Copyright 2013 ACM.