Optimizing multi-dimensional MPI communications on multi-core architectures
In today's high-performance computing, many Message Passing Interface (MPI) programs (e.g., ScaLAPACK applications, the High Performance Linpack benchmark (HPL), and most PDE solvers based on domain decomposition methods) organize their computational processes as multi-dimensional Cartesian grids. Such applications often need to communicate along every dimension of the Cartesian grid. While extensive optimization has been applied to single-dimensional communications, such as the standard MPI collective operations, little work has been done to optimize multi-dimensional communications. We study the ...