Enabling Efficient Multithreaded MPI Communication Through a Library-Based Implementation of MPI Endpoints



TIME: 11:00AM - 11:30AM

SESSION CHAIR: Ron Brightwell

AUTHOR(S): Srinivas Sridharan, James Dinan, Dhiraj Kalamkar



Modern high-speed interconnection networks are designed with capabilities to support communication from multiple processor cores. The MPI endpoints extension has been proposed to ease process and thread count tradeoffs by enabling multithreaded MPI applications to efficiently drive independent network communication. In this work, we present the first implementation of the MPI endpoints interface and demonstrate the first applications running on this new interface. We use a novel library-based design that can be layered on top of any existing, production MPI implementation. Our approach uses proxy processes to isolate threads in an MPI job, eliminating threading overheads within the MPI library and allowing threads to achieve process-like communication performance. Performance results for the Lattice QCD Dslash kernel indicate that endpoints provide up to a 2.9x improvement in communication performance and a 1.87x overall performance improvement over a highly optimized hybrid MPI+OpenMP baseline on 128 processors.

Ron Brightwell (Chair) - Sandia National Laboratories

Srinivas Sridharan - Intel Corporation

James Dinan - Intel Corporation

Dhiraj Kalamkar - Intel Corporation

