Publication
SUSCOM
Paper
Systematic derivation of time and power models for linear algebra kernels on multicore architectures
Abstract
The power wall calls for a holistic effort from the high-performance and scientific computing communities to develop power-aware tools and applications, which ultimately drive the design of energy-efficient hardware. Toward this goal, we introduce a systematic methodology for deriving reliable time and power models for algebraic kernels using a bottom-up approach. This strategy helps to understand the contribution of the different kernels to the total energy consumption of an application, and to distinguish between the costs of fine-grain components such as arithmetic, memory access, and the overheads introduced by, e.g., multithreading or reductions. To study and validate our methodology, we initially focus on two key memory-bound BLAS-1 vector kernels: the dot product and the axpy operation. Subsequently, we show how these kernels can be composed to accurately predict the energy consumption of more heterogeneous algorithms, such as the Conjugate Gradient method, while tackling the elaborate memory hierarchy and the high degree of concurrency of today's processors. In particular, the evaluation of the models on the IBM® Blue Gene/Q supercomputer, as well as on the IBM® Power 755 server, reveals that average power consumption is captured with high accuracy. The models and the methodology are nevertheless general enough to be portable to any general-purpose multicore architecture.
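To make the bottom-up idea concrete, the following is a minimal sketch, not the paper's actual models, of how the Conjugate Gradient method decomposes into the dot product and axpy kernels named in the abstract, and how per-kernel costs could be summed into an application-level estimate. The `matvec` kernel, the call counter, the `composed_energy` helper, and any per-call energy values passed to it are illustrative assumptions; the abstract does not give the form or the parameters of the derived models.

```python
from collections import Counter

# Counts how often each kernel is invoked (illustrative bookkeeping only).
calls = Counter()

def dot(x, y):
    """BLAS-1 dot product: sum of x[i] * y[i]."""
    calls["dot"] += 1
    return sum(a * b for a, b in zip(x, y))

def axpy(alpha, x, y):
    """BLAS-1 axpy: returns alpha * x + y as a new list."""
    calls["axpy"] += 1
    return [alpha * a + b for a, b in zip(x, y)]

def matvec(A, x):
    """Dense matrix-vector product (assumed here; not covered in the abstract)."""
    calls["matvec"] += 1
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Textbook CG for a symmetric positive definite A, built from the kernels above."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A*x for the initial guess x = 0
    p = list(r)              # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = axpy(alpha, p, x)          # x = x + alpha * p
        r = axpy(-alpha, Ap, r)        # r = r - alpha * A*p
        rr_new = dot(r, r)
        p = axpy(rr_new / rr, p, r)    # p = r + beta * p
        rr = rr_new
    return x

def composed_energy(kernel_calls, energy_per_call):
    """Hypothetical composition: total = sum over kernels of (calls * energy per call)."""
    return sum(n * energy_per_call[name] for name, n in kernel_calls.items())
```

In this sketch every CG iteration costs one `matvec`, two `dot` calls, and three `axpy` calls, so an estimate for the whole solver follows from the per-kernel costs and the iteration count. A real model would replace the constant per-call energies with the time and power models derived for each kernel, which depend on vector length, thread count, and the memory level the data resides in.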