The International Conference for High Performance Computing, Networking, Storage and Analysis
Lessons From Analyzing Fan-In Communications
Authors: Terry Jones (Oak Ridge National Laboratory), Bradley Settlemyer (Oak Ridge National Laboratory)
Abstract: We present a study of an important class of communication operations: the fan-in communication pattern. By their nature, fan-in communications form ‘hot spots’ that present significant challenges for any interconnect fabric and communication software stack. Despite these inherent challenges, such communication patterns are common in both applications (which often perform reductions and other collective operations that include fan-in communication, such as barriers) and system software (where they play an important role within parallel file systems and other components requiring high-bandwidth or low-latency I/O). Our study determines the effectiveness of different client-server fan-in strategies. We describe fan-in performance in terms of aggregate bandwidth in the presence of varying degrees of congestion, as well as several other key attributes. Results are presented for a large supercomputer based on the Cray Aries interconnect and a large supercomputer based on the Gemini interconnect. Finally, we provide recommended communication strategies based on our findings.