Debugging concurrent processes: A case study
Abstract
We present a case study that illustrates a method of debugging concurrent processes in a parallel programming environment. It uses a new approach called speculative replay to reconstruct the behavior of a program from the histories of its individual processes. Known time dependences between events in different processes are used to divide the histories into dependence blocks. A graphical representation called a concurrency map displays possibilities for concurrency among processes. The replay technique preserves the known dependences and compares the process histories generated during replay with those that were logged during the original program execution. If a process generates a replay history that does not match its original history, replay backs up. An alternative ordering of events is created and tested to see if it produces process histories that match the original histories. Successively more controlled replay sequences are generated by introducing additional dependences. We describe ongoing work on tools that will control replay without reconstructing the entire space of possible event orderings. The case study presents a miniature example of shared queue management that can be examined in detail. It demonstrates the replay technique and the construction and use of the concurrency map. Using our techniques, we detect a failure to which a standard algorithm for shared queue management is susceptible.
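The back-up-and-retry idea behind replay can be illustrated with a small sketch. This is not the tool described in this paper; it is a minimal illustration under simplifying assumptions: processes are fixed lists of operations on a single shared FIFO queue, a process history is the list of values each operation observed, and known dependences are given as ordered pairs of events. All names (`apply_op`, `speculative_replay`, `programs`, `logged`) are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Event = Tuple[str, int]          # (process name, index within that process)
Op = Tuple[str, Optional[int]]   # ("enq", value) or ("deq", None)


def apply_op(op: Op, queue: List[Optional[int]]) -> Optional[int]:
    """Run one operation on the shared queue; return what the process observes."""
    kind, value = op
    if kind == "enq":
        queue.append(value)
        return None
    # "deq": in this toy model an empty queue is observed as None
    return queue.pop(0) if queue else None


def speculative_replay(
    programs: Dict[str, List[Op]],
    logged: Dict[str, List[Optional[int]]],
    dependences: Set[Tuple[Event, Event]],
) -> Optional[List[Event]]:
    """Search for an event ordering that preserves the known dependences and
    reproduces every logged process history; return None if none exists."""
    pos = {p: 0 for p in programs}   # next event index per process
    done: Set[Event] = set()
    order: List[Event] = []

    def search(queue: List[Optional[int]]) -> Optional[List[Event]]:
        if all(pos[p] == len(programs[p]) for p in programs):
            return list(order)
        for p in sorted(programs):
            i = pos[p]
            if i == len(programs[p]):
                continue
            event = (p, i)
            # Skip events whose known predecessors have not yet been replayed.
            if any(b == event and a not in done for a, b in dependences):
                continue
            trial = list(queue)      # speculate on a copy of the shared state
            if apply_op(programs[p][i], trial) != logged[p][i]:
                continue             # history mismatch: try another ordering
            pos[p] += 1
            done.add(event)
            order.append(event)
            result = search(trial)
            if result is not None:
                return result
            # Back up: undo the speculative step and try the next candidate.
            pos[p] -= 1
            done.remove(event)
            order.pop()
        return None

    return search([])


# Hypothetical logged run: consumer C first finds the queue empty, then gets 1.
programs = {"P": [("enq", 1), ("enq", 2)], "C": [("deq", None), ("deq", None)]}
logged = {"P": [None, None], "C": [None, 1]}

print(speculative_replay(programs, logged, set()))
print(speculative_replay(programs, logged, {(("P", 0), ("C", 0))}))
```

With no dependences, the search finds an ordering in which C's first dequeue precedes P's first enqueue. Adding the dependence that P's first enqueue precedes C's first dequeue makes the logged histories unreproducible, so the search returns `None`. This exhaustive search explores the whole space of orderings in the worst case, which is the cost the tools mentioned above aim to avoid.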