Persistent and Partitioned MPI for Stencil Communication

Published: August 18, 2025 | arXiv ID: 2508.13370v1

By: Gerald Collom, Jason Burmark, Olga Pearce and more

Potential Business Impact:

Makes computer programs run much faster.

Many parallel applications rely on iterative stencil operations, whose performance is dominated by communication costs at large scales. Several MPI optimizations, such as persistent and partitioned communication, reduce overheads and improve communication efficiency through amortized setup costs and reduced synchronization of threaded sends. This paper presents the performance of stencil communication in the Comb benchmarking suite when using nonblocking, persistent, and partitioned communication routines. The impact of each optimization is analyzed at various scales. Further, the paper presents an analysis of the impact of process count, thread count, and message size on partitioned communication routines. Measured timings show that persistent MPI communication can provide a speedup of up to 37% over the baseline MPI communication, and partitioned MPI communication can provide a speedup of up to 68%.
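
To make the "amortized setup costs" idea concrete, here is a minimal cost-model sketch in Python. It is not taken from the paper or from Comb. The function names and the unit costs in the usage note are hypothetical, chosen only for illustration. The model assumes that a nonblocking exchange (for example MPI_Isend/MPI_Irecv) pays a fixed setup cost on every iteration, while a persistent exchange (for example MPI_Send_init/MPI_Recv_init followed by MPI_Start each iteration) pays it once. It does not model partitioned communication or thread synchronization.

```python
def nonblocking_cost(iterations, setup, transfer):
    """Total cost when every iteration pays the setup cost again.

    Hypothetical model: `setup` and `transfer` are abstract cost units,
    not measurements from the paper.
    """
    return iterations * (setup + transfer)


def persistent_cost(iterations, setup, transfer):
    """Total cost when setup is paid once and reused by every iteration."""
    return setup + iterations * transfer


def speedup(iterations, setup, transfer):
    """Ratio of nonblocking cost to persistent cost under this model."""
    return nonblocking_cost(iterations, setup, transfer) / persistent_cost(
        iterations, setup, transfer
    )


if __name__ == "__main__":
    # Illustrative unit costs only (assumed, not measured).
    for n in (1, 10, 100, 1000):
        print(n, round(speedup(n, setup=2, transfer=8), 3))
```

Under this model the ratio is 1 for a single iteration and approaches (setup + transfer) / transfer as the iteration count grows, so the benefit depends on how large the setup cost is relative to the transfer itself. Real stencil codes will deviate from this simple picture.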

Country of Origin
🇺🇸 United States

Page Count
7 pages

Category
Computer Science:
Distributed, Parallel, and Cluster Computing