Chen-chia Chang, Wan-hsuan Lin, et al.
ICML 2025
We demonstrate teraflop performance of the fully featured ab initio molecular dynamics code CPMD on an IBM pSeries 690 cluster. A mixed approach is used to map the algorithms optimally onto the available hardware: coarse-grained, distributed-memory parallelism using the MPI library combined with fine-grained, shared-memory parallelism using OpenMP directives. The top performance achieved is ≈20% of peak performance, with an estimated parallel efficiency of ≈45% on 1024 processors for a system of 1000 atoms. The main factor limiting parallel efficiency was found to be the latency of the interconnect. © 2005 Elsevier B.V. All rights reserved.
Michael Muller, Anna Kantosalo, et al.
CHI 2024
Hannah Kim, Celia Cintas, et al.
IJCAI 2023
Paul G. Comba
Journal of the ACM