Sameer Kumar, Philip Heidelberger, et al.
IPDPS 2010
Many systems of great importance in material science, chemistry, solid-state physics, and biophysics require forces generated from an electronic structure calculation, as opposed to an empirically derived force law, to describe their properties adequately. The use of such forces as input to Newton's equations of motion forms the basis of the ab initio molecular dynamics method, which is able to treat the dynamics of chemical bond-breaking and -forming events. However, a very large number of electronic structure calculations must be performed to compute an ab initio molecular dynamics trajectory, making the efficiency as well as the accuracy of the electronic structure representation critical issues. One efficient and accurate electronic structure method is the generalized gradient approximation to the Kohn-Sham density functional theory implemented using a plane-wave basis set and atomic pseudopotentials. The marriage of the gradient-corrected density functional approach with molecular dynamics, as pioneered by Car and Parrinello (R. Car and M. Parrinello, Phys Rev Lett 1985, 55, 2471), has been demonstrated to be capable of elucidating the atomic-scale structure and dynamics underlying many complex systems at finite temperature. However, despite the relative efficiency of this approach, it has not been possible to obtain parallel scaling of the technique beyond several hundred processors on moderately sized systems using standard approaches. Consequently, the time scales that can be accessed and the degree of phase space sampling are severely limited. To take advantage of next-generation computer platforms with thousands of processors such as IBM's BlueGene, a novel scalable parallelization strategy for Car-Parrinello molecular dynamics is developed using the concept of processor virtualization as embodied by the Charm++ parallel programming system. Charm++ allows the diverse elements of a Car-Parrinello molecular dynamics calculation to be interleaved with low latency such that unprecedented scaling is achieved. As a benchmark, a system of 32 water molecules, a common system size employed in the study of the aqueous solvation and chemistry of small molecules, is shown to scale on more than 1500 processors, which is impossible to achieve using standard approaches. This degree of parallel scaling is expected to open new opportunities for scientific inquiry. © 2004 Wiley Periodicals, Inc.
Gheorghe Almasi, Sameh Asaad, et al.
IBM J. Res. Dev
Sameer Kumar, Yogish Sabharwal, et al.
ICPP 2008
Ahmad Faraj, Sameer Kumar, et al.
HOTI 2009