Publication
ISCA 2003
Conference paper

A "flight data recorder" for enabling full-system multiprocessor deterministic replay

View publication

Abstract

Debuggers have been proven indispensable in improving software reliability. Unfortunately, on most real-life software, debuggers fail to deliver their most essential feature - a faithful replay of the execution. The reason is non-determinism caused by multithreading and non-repeatable inputs. A common solution to faithful replay has been to record the non-deterministic execution. Existing recorders, however, either work only for data-race-free programs or have prohibitive overhead. As a step towards powerful debugging, we develop a practical low-overhead hardware recorder for cache-coherent multiprocessors, called Flight Data Recorder (FDR). Like an aircraft flight data recorder, FDR continuously records the execution, even on deployed systems, logging the execution for post-mortem analysis. FDR is practical because it piggybacks on the cache coherence hardware and logs nearly the minimal thread-ordering information necessary to faithfully replay the multiprocessor execution. Our studies, based on simulating a four-processor server with commercial workloads, show that when allocated less than 7% of system's physical memory, our FDR design can capture the last one second of the execution at modest (less than 2%) slowdown.

Date

Publication

ISCA 2003

Authors

Share