Conference paper

PARSIM: A parallel trace-driven simulation facility for fast and accurate performance analysis studies


One of the major impediments to pre-silicon performance analysis is the ever-increasing size of real workloads, which makes trace-based simulation impractical in time-bound processor development projects. In this paper, we describe a simple method of speeding up trace-driven architectural simulation tools through parallel processing. The PARSIM facility allows the processor performance team to accelerate its existing trace-driven simulation methodology without modifying the original trace generation and simulation tools. In achieving speed-up, it is important to ensure that there is no significant loss of accuracy compared to runs made on a uniprocessor workstation. PARSIM allows the user to retain accuracy by automatically adding a cache-state warm-up preamble to each parallel trace chunk. It also offers built-in options for selecting samples from each parallel trace chunk. PARSIM is currently implemented on an IBM SP-2 system. We report experimental results for selected benchmark workloads to demonstrate the practical use of this facility.
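
The chunking idea can be illustrated with a minimal sketch. It is not PARSIM's actual interface, which the abstract does not describe: the names `Chunk` and `split_trace` and the parameter `warmup_len` are hypothetical. The sketch also assumes that each warm-up preamble consists of the trace entries immediately preceding its chunk, replayed only to prime cache state and excluded from the reported statistics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """One unit of parallel work (hypothetical structure, for illustration only)."""
    warmup: range    # trace indices replayed only to warm cache state
    measured: range  # trace indices whose statistics are counted


def split_trace(trace_len: int, n_chunks: int, warmup_len: int) -> List[Chunk]:
    """Split a trace of trace_len entries into n_chunks contiguous chunks.

    Each chunk is given a warm-up preamble of up to warmup_len entries taken
    from the trace immediately before the chunk (an assumption of this sketch).
    The first chunk has no preamble because nothing precedes it.
    """
    if n_chunks <= 0 or trace_len < 0 or warmup_len < 0:
        raise ValueError("n_chunks must be positive; lengths must be non-negative")

    base, extra = divmod(trace_len, n_chunks)
    chunks: List[Chunk] = []
    start = 0
    for i in range(n_chunks):
        # Spread the remainder over the first `extra` chunks.
        size = base + (1 if i < extra else 0)
        end = start + size
        warm_start = max(0, start - warmup_len)
        chunks.append(Chunk(warmup=range(warm_start, start),
                            measured=range(start, end)))
        start = end
    return chunks


if __name__ == "__main__":
    for chunk in split_trace(trace_len=10, n_chunks=3, warmup_len=2):
        print(list(chunk.warmup), list(chunk.measured))
```

In this sketch the measured ranges cover the trace exactly once, so per-chunk statistics could be summed across processors, while the preambles overlap the preceding chunks and cost extra simulation time in exchange for warmer cache state.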
