Accelerating MPI Collective Communications through Hierarchical Algorithms with Flexible Inter-Node Communication and Imbalance Awareness

SESSION: Doctoral Showcase - Dissertation Research I

EVENT TYPE: Doctoral Showcase

TIME: 11:45AM - 12:00PM

SESSION CHAIR: Karen L. Karavanic

Presenter(s): Benjamin Parsons



This work investigates collective communication algorithms on a shared-memory system and develops the universal hierarchical algorithm. This algorithm can pair arbitrary hierarchy-unaware inter-node communication algorithms with shared-memory intra-node communication. In addition to supporting flexible inter-node communication, it works with all collectives, including those incompatible with past work, such as alltoallv. The universal algorithm improves upon both the MPICH algorithms and the Cray MPT algorithms: speedups average 15x to 30x for most collectives, with improved scalability up to 64k cores.
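
The three-phase structure described above (intra-node combine, inter-node exchange among one leader per node, intra-node distribution) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the dissertation's implementation: plain Python lists stand in for MPI ranks and shared memory, the collective shown is a sum allreduce, and the names `hierarchical_allreduce`, `recursive_doubling_allreduce`, and `node_of` are hypothetical. The point it tries to show is that the inter-node step is a pluggable function that knows nothing about the node hierarchy.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def recursive_doubling_allreduce(values: List[int]) -> List[int]:
    """Hierarchy-unaware sum allreduce among p participants.

    Assumption: p is a power of two (the simplest form of recursive doubling).
    """
    p = len(values)
    assert p > 0 and p & (p - 1) == 0, "participant count must be a power of two"
    vals = list(values)
    dist = 1
    while dist < p:
        # Each participant i exchanges with partner i XOR dist and adds.
        vals = [vals[i] + vals[i ^ dist] for i in range(p)]
        dist *= 2
    return vals


def hierarchical_allreduce(
    values: List[int],
    node_of: List[int],
    inter_node: Callable[[List[int]], List[int]],
) -> List[int]:
    """Simulated hierarchical sum allreduce.

    values[r]  -- contribution of rank r
    node_of[r] -- node id hosting rank r (hypothetical placement map)
    inter_node -- any allreduce over one value per node; it is unaware
                  of the hierarchy and can be swapped freely
    """
    ranks_on: Dict[int, List[int]] = defaultdict(list)
    for rank, node in enumerate(node_of):
        ranks_on[node].append(rank)
    nodes = sorted(ranks_on)

    # Phase 1: intra-node combine (stand-in for a shared-memory reduction).
    partial = [sum(values[r] for r in ranks_on[n]) for n in nodes]

    # Phase 2: inter-node exchange among one leader per node.
    reduced = inter_node(partial)

    # Phase 3: intra-node distribution (stand-in for a shared-memory read).
    result = [0] * len(values)
    for idx, n in enumerate(nodes):
        for r in ranks_on[n]:
            result[r] = reduced[idx]
    return result
```

Swapping the inter-node algorithm only changes the third argument, for example `hierarchical_allreduce(vals, nodes, recursive_doubling_allreduce)` versus `hierarchical_allreduce(vals, nodes, lambda v: [sum(v)] * len(v))`; both calls return the same result.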

The second part of this work creates new hierarchical collective algorithms designed to tolerate process imbalance. The process imbalance of benchmarks is thoroughly evaluated, and the results are used to design collective algorithms that minimize the synchronization delay observed by early-arriving processes. Preliminary results for a reduction show speedups reaching 47x over a binomial tree algorithm in the presence of high, but not unreasonable, imbalance.
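
Why arrival imbalance matters for a reduction can be seen in a toy timing model. The sketch below is an illustration only and is not the algorithm developed in this work: it assumes each combine step has a fixed cost, ignores network latency and bandwidth, and compares a fixed binomial tree against a hypothetical scheme in which the root combines contributions in the order they arrive. The function names and the cost model are assumptions made for the example.

```python
from typing import List


def binomial_tree_finish(arrival: List[float], cost: float) -> float:
    """Time at which rank 0 holds the full result using a fixed binomial tree.

    arrival[r] is the time rank r enters the reduction; every combine
    takes `cost`. Assumption: the rank count is a power of two.
    """
    p = len(arrival)
    assert p > 0 and p & (p - 1) == 0, "rank count must be a power of two"
    ready = list(arrival)
    dist = 1
    while dist < p:
        # In this round, rank r receives from rank r + dist and combines.
        for r in range(0, p, 2 * dist):
            ready[r] = max(ready[r], ready[r + dist]) + cost
        dist *= 2
    return ready[0]


def arrival_order_finish(arrival: List[float], cost: float) -> float:
    """Finish time when contributions are combined one by one in arrival order.

    A hypothetical imbalance-tolerant stand-in: early arrivals are folded
    in while late ranks are still computing.
    """
    order = sorted(arrival)
    done = order[0]
    for t in order[1:]:
        # A combine starts once the running result and the next
        # contribution are both available.
        done = max(done, t) + cost
    return done
```

In this model, with eight ranks, a unit combine cost, and one rank arriving at time 10 while the others arrive at time 0, the fixed tree finishes at 13 and the arrival-order scheme at 11. With perfectly balanced arrivals the tree wins (3 versus 7), which is the trade-off an imbalance-aware design has to manage.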

Chair/Presenter Details:

Karen L. Karavanic (Chair) - Portland State University

Benjamin Parsons - Purdue University

