Managing data-movement for effective shared-memory parallelization of out-of-core sparse solvers
Abstract
Direct methods for solving sparse linear systems are robust and typically perform well, but they often require large amounts of memory because of fill-in. Many industrial applications use out-of-core techniques to mitigate this problem. However, parallelizing sparse out-of-core solvers poses unique challenges, because accessing secondary storage introduces serialization and I/O overhead. We analyze the data-movement costs and the trade-offs between memory and parallelism in a shared-memory parallel out-of-core linear solver for sparse symmetric systems. We propose an algorithm that uses a novel memory management scheme and adaptive task parallelism to reduce the data-movement costs. Our experiments show that our solver is faster than existing out-of-core sparse solvers on a single core and more scalable than the only other known shared-memory parallel out-of-core solver. This work is also directly applicable at the node level in a distributed-memory parallel scenario. © 2012 IEEE.