SORS: Optimizing Collective Communications on Hierarchical Networks
Objectives
Abstract: Today, communication networks on high-performance computing (HPC) systems emphasize fat-node designs with multiple GPUs and network interface cards (NICs) per node. Following this trend, intra-node and inter-node network architectures form increasingly complex and diverse hierarchies. The challenge for application and library developers is understanding the network’s often non-obvious characteristics in order to tailor optimizations to them. As a remedy, we developed a configurable benchmarking tool (CommBench) that offers an intuitive API for composing a desired communication pattern and taking measurements using MPI, NCCL, or IPC with no additional effort. This talk will demonstrate the performance implications exposed by CommBench, with a view toward designing generalized hierarchical optimizations. To conclude, we will discuss optimizations based on multi-NIC striping of non-uniform hierarchical trees and on pipelining communications across multiple levels of the network hierarchy.
Short bio: Mert Hidayetoglu is a postdoctoral scholar at Stanford University, where he works with Alex Aiken at SLAC National Accelerator Laboratory. He received his Ph.D. from the University of Illinois at Urbana-Champaign under the supervision of Wen-mei Hwu. His thesis is on optimizing sparse computations and communications for solving large-scale inverse problems using supercomputers. He received the SC20 best paper award and the ACM/IEEE-CS George Michael Memorial HPC Fellowship in 2021.
Speakers
Host: Toni Peña, Accelerators for High Performance Computing Group Manager, CS, BSC