
Designing and Prototyping Extensions to MPI in MPICH (2402.12274v1)

Published 19 Feb 2024 in cs.DC

Abstract: As HPC system architectures and the applications running on them continue to evolve, the MPI standard itself must evolve. The trend in current and future HPC systems toward powerful nodes with multiple CPU cores and multiple GPU accelerators makes efficient support for hybrid programming critical for applications to achieve high performance. However, the support for hybrid programming in the MPI standard has not kept up with recent trends. The MPICH implementation of MPI provides a platform for implementing and experimenting with new proposals and extensions to fill this gap and to gain valuable experience and feedback before the MPI Forum can consider them for standardization. In this work, we detail six extensions implemented in MPICH to increase MPI interoperability with other runtimes, with a specific focus on heterogeneous architectures. First, the extension to MPI generalized requests lets applications integrate asynchronous tasks into MPI's progress engine. Second, the iovec extension to datatypes lets applications use MPI datatypes as a general-purpose data layout API beyond just MPI communications. Third, a new MPI object, MPIX stream, can be used by applications to identify execution contexts beyond MPI processes, including threads and GPU streams. MPIX stream communicators can be created to make existing MPI functions thread-aware and GPU-aware, thus providing applications with explicit ways to achieve higher performance. Fourth, MPIX streams are extended to support the enqueue semantics for offloading MPI communications onto a GPU stream context. Fifth, thread communicators allow MPI communicators to be constructed with individual threads, thus providing a new level of interoperability between MPI and on-node runtimes such as OpenMP. Lastly, we present an extension to invoke MPI progress, which lets users spawn progress threads with fine-grained control.
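The third and fourth extensions correspond to the MPIX_Stream API shipped in recent MPICH releases. The following is a minimal sketch, not a definitive implementation: it assumes the info hint keys ("type", "value") and MPIX_Info_set_hex as documented by MPICH for wrapping a CUDA stream, and exact signatures may vary across MPICH versions. Run with two processes.

```c
/*
 * Sketch: wrap a CUDA stream in an MPIX stream, create a stream
 * communicator, and enqueue point-to-point communication onto the GPU
 * stream (assumes MPICH's MPIX_Stream extension; hint keys and
 * signatures are as documented for recent MPICH versions).
 */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cudaStream_t cuda_stream;
    cudaStreamCreate(&cuda_stream);

    /* Describe the CUDA stream to MPI through info hints, then create
     * an MPIX stream identifying this GPU execution context. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "type", "cudaStream_t");
    MPIX_Info_set_hex(info, "value", &cuda_stream, sizeof(cuda_stream));

    MPIX_Stream stream;
    MPIX_Stream_create(info, &stream);
    MPI_Info_free(&info);

    /* A stream communicator binds MPI operations to that context. */
    MPI_Comm stream_comm;
    MPIX_Stream_comm_create(MPI_COMM_WORLD, stream, &stream_comm);

    /* Enqueue communication onto the GPU stream: the send/recv execute
     * in stream order with any kernels already queued on cuda_stream. */
    double *buf;
    cudaMalloc((void **)&buf, sizeof(double));
    if (rank == 0)
        MPIX_Send_enqueue(buf, 1, MPI_DOUBLE, 1, 0, stream_comm);
    else if (rank == 1)
        MPIX_Recv_enqueue(buf, 1, MPI_DOUBLE, 0, 0, stream_comm,
                          MPI_STATUS_IGNORE);

    cudaStreamSynchronize(cuda_stream);

    cudaFree(buf);
    MPI_Comm_free(&stream_comm);
    MPIX_Stream_free(&stream);
    cudaStreamDestroy(cuda_stream);
    MPI_Finalize();
    return 0;
}
```

The point of the enqueue variants is that the communication becomes an ordered item on the GPU stream rather than an immediate host-side call, so no host synchronization is needed between a producing kernel and the send that consumes its output.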
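The fifth extension, thread communicators, also has a concrete MPIX interface in MPICH. Below is a minimal sketch assuming the MPIX_Threadcomm_* calls described in the paper, under which each OpenMP thread participates as its own rank; compile with OpenMP enabled.

```c
/*
 * Sketch: a thread communicator in which each OpenMP thread acts as a
 * full MPI rank (assumes MPICH's MPIX_Threadcomm_* extension as
 * described in the paper).
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    int nthreads = 4;
    MPI_Comm threadcomm;
    MPIX_Threadcomm_init(MPI_COMM_WORLD, nthreads, &threadcomm);

    #pragma omp parallel num_threads(nthreads)
    {
        /* Activate the communicator on this thread; from here on the
         * thread behaves as its own rank within threadcomm. */
        MPIX_Threadcomm_start(threadcomm);

        int rank, size;
        MPI_Comm_rank(threadcomm, &rank);
        MPI_Comm_size(threadcomm, &size);
        printf("thread-level rank %d of %d\n", rank, size);

        /* Ordinary MPI calls now work between threads. */
        MPI_Barrier(threadcomm);

        MPIX_Threadcomm_finish(threadcomm);
    }

    MPIX_Threadcomm_free(&threadcomm);
    MPI_Finalize();
    return 0;
}
```

With P processes each contributing T threads, MPI_Comm_size on the thread communicator should report P*T, which is what lets existing rank-based MPI code interoperate with an on-node OpenMP parallel region largely unchanged.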

Citations (1)

