Deploying and optimizing performance of a 3D hydrodynamic model on cloud
Abstract
Container-based cloud computing, as standardised and popularised by the open-source Docker project, offers many opportunities for scientific applications in high-performance computing (HPC). It promises highly flexible and available compute capabilities via the cloud, without the resource overheads of traditional virtual machines. Further, productivity gains can be made through easy repackaging of images with additional developments, automated deployments, and version-control integrations. Nevertheless, container overhead and the implementation and performance of overlay networks are areas that require detailed study to allow for well-defined quality of service for typical HPC applications. This paper presents details on deploying the Environmental Fluid Dynamics Code (EFDC) on a container-based cloud environment. Results are compared to a bare-metal deployment. Application-specific benchmarking tests are complemented by detailed network tests that evaluate isolated MPI communication protocols at both intra-node and inter-node level with varying degrees of self-contention. Cloud-based simulations show a significant loss of performance in mean run-times: a containerised environment increases simulation time by up to 50%. More detailed analysis demonstrates that much of this performance penalty is a result of large variance in MPI communication times. This manifests as run-time variance on the container cloud that hinders both simulation run-time and the collection of well-defined quality-of-service metrics.
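The two quantities the abstract reports, the increase in mean run-time and the run-to-run variance, can be computed from repeated timings as in the minimal sketch below. It is an illustration only, not the analysis code used in this work: the function names are hypothetical, the coefficient of variation is one assumed way of expressing variance, and the timing values are invented for the example rather than taken from the measurements.

```python
from statistics import mean, stdev


def summarize(times):
    """Return (mean, coefficient of variation) for run-times in seconds."""
    m = mean(times)
    cv = stdev(times) / m  # sample standard deviation relative to the mean
    return m, cv


def relative_slowdown(baseline_times, candidate_times):
    """Fractional increase in mean run-time of candidate over baseline."""
    return mean(candidate_times) / mean(baseline_times) - 1.0


# Hypothetical timings in seconds, invented purely for illustration.
bare_metal = [100.0, 101.0, 99.0, 100.0]
container = [130.0, 170.0, 140.0, 160.0]

slowdown = relative_slowdown(bare_metal, container)
_, cv_bare = summarize(bare_metal)
_, cv_container = summarize(container)

print(f"slowdown: {slowdown:.0%}")
print(f"CV bare metal: {cv_bare:.3f}, CV container: {cv_container:.3f}")
```

Reporting the coefficient of variation alongside the mean makes the two effects separable: a deployment can show a modest mean penalty and still be unsuitable for quality-of-service guarantees if its run-to-run spread is large.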