Reducing the variance of point to point transfers in the IBM 9076 parallel computer
Abstract
Commodity workstations have adapted to standard UNIX-like environments to allow scientists to efficiently develop and port applications across systems. UNIX-based environments, such as IBM's AIX, furnish such an operating environment while providing efficient uniprocessor utilization for user code execution. When these machines are interconnected with a low-latency (user-space) communication mechanism, large variances in point-to-point communication times for identical parallel programs are typically found. It is our contention that a large part of this variance is introduced by operating system support functionality that can delay point-to-point user-space communications. We measure this effect experimentally by monitoring the change in the time taken to circulate a token through parallel processors connected in a virtual ring configuration. This paper proposes some solutions and then experimentally validates their ability to reduce point-to-point message-passing variance on IBM 9076 (SP1) machines.
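The token-ring measurement can be illustrated with a minimal sketch. It is not the authors' benchmark: threads and in-process queues stand in for the processors and the user-space communication layer, and the names `ring_lap_times` and `summarize` are hypothetical. It only shows the shape of the experiment, in which one node times each full trip of a token around a virtual ring and the spread of those lap times is then examined.

```python
import statistics
import threading
import time
from queue import Queue


def ring_lap_times(n_nodes=4, n_laps=200):
    """Circulate a token around a virtual ring; return per-lap times in seconds."""
    # One inbox per node; node i forwards to node (i + 1) % n_nodes.
    inboxes = [Queue() for _ in range(n_nodes)]
    lap_times = []

    def relay(rank):
        # Nodes 1..n-1 forward every token and exit after forwarding the stop marker.
        while True:
            token = inboxes[rank].get()
            inboxes[(rank + 1) % n_nodes].put(token)
            if token is None:
                return

    workers = [threading.Thread(target=relay, args=(r,)) for r in range(1, n_nodes)]
    for w in workers:
        w.start()

    # Node 0 injects the token and times each complete trip around the ring.
    for lap in range(n_laps):
        start = time.perf_counter()
        inboxes[1 % n_nodes].put(lap)
        inboxes[0].get()
        lap_times.append(time.perf_counter() - start)

    inboxes[1 % n_nodes].put(None)  # stop marker travels the ring once
    inboxes[0].get()                # drain it when it returns to node 0
    for w in workers:
        w.join()
    return lap_times


def summarize(lap_times):
    """Mean, sample variance, and worst case of the lap times (needs >= 2 laps)."""
    return {
        "mean": statistics.mean(lap_times),
        "variance": statistics.variance(lap_times),
        "max": max(lap_times),
    }


if __name__ == "__main__":
    print(summarize(ring_lap_times()))
```

In a sketch like this, occasional laps that take far longer than the mean are the kind of outlier the abstract attributes to operating system activity delaying a transfer. The absolute numbers here reflect thread scheduling on one host and say nothing about SP1 timings.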