
Modeling the Potential of Message-Free Communication via CXL.mem

Published: December 8, 2025 | arXiv ID: 2512.08005v1

By: Stepan Vanecek, Matthew Turner, Manisha Gajbe, and more

Potential Business Impact:

Lets multiple computers in a cluster share a common pool of memory, so they can exchange data faster than by sending messages between them.

Business Areas:
Cloud Computing, Internet Services, Software

Heterogeneous memory technologies are increasingly important instruments in addressing the memory wall in HPC systems. While most are deployed in single-node setups, CXL.mem implements memory that can be attached to multiple nodes simultaneously, enabling shared memory pooling. This opens new possibilities, particularly for efficient inter-node communication. In this paper, we present a novel performance evaluation toolchain combined with an extended performance model for message-based communication, which can be used to predict potential performance benefits of using CXL.mem for data exchange. Our approach analyzes the data access patterns of MPI applications: it captures on-node accesses to and from MPI buffers as well as cross-node MPI traffic to gain a full understanding of the impact of memory performance. We combine these data in an extended performance model to predict which data transfers could benefit from direct CXL.mem implementations compared to traditional MPI messages. Our model works at per-MPI-call granularity, allowing the identification, and later optimization, of the MPI invocations in the code with the highest potential for speedup from CXL.mem. For our toolchain, we extend the memory trace sampling tool Mitos and use it to extract data access behavior. In the post-processing step, the raw data is automatically analyzed to provide a performance model for each individual MPI call. We validate the models on two sample applications, a 2D heat-transfer miniapp and the HPCG benchmark, and use them to demonstrate how they support targeted optimizations that integrate CXL.mem.
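To make the kind of per-call prediction concrete, below is a minimal Python sketch of a latency-bandwidth (alpha-beta) comparison between a traditional MPI message and a data exchange through a shared CXL.mem pool. The model form, the write-plus-read charge for the CXL.mem path, and all parameter values are illustrative assumptions for this sketch, not the paper's calibrated model or measured numbers; in the actual toolchain such inputs would come from the Mitos-based measurements described above.

```python
from dataclasses import dataclass


@dataclass
class LinkParams:
    latency_s: float      # per-transfer startup cost (seconds)
    bandwidth_Bps: float  # sustained bandwidth (bytes per second)


def transfer_time(size_bytes: int, link: LinkParams) -> float:
    """Classic alpha-beta model: T = latency + size / bandwidth."""
    return link.latency_s + size_bytes / link.bandwidth_Bps


def cxl_speedup(size_bytes: int, mpi: LinkParams, cxl: LinkParams) -> float:
    """Predicted speedup of a CXL.mem exchange over an MPI message.

    Simplifying assumption: the CXL.mem path is charged for both the
    producer's write into the shared pool and the consumer's read from it.
    """
    t_mpi = transfer_time(size_bytes, mpi)
    t_cxl = 2 * transfer_time(size_bytes, cxl)
    return t_mpi / t_cxl


if __name__ == "__main__":
    # Hypothetical link parameters, chosen only for illustration.
    mpi_link = LinkParams(latency_s=2e-6, bandwidth_Bps=12e9)
    cxl_link = LinkParams(latency_s=0.4e-6, bandwidth_Bps=30e9)
    for size in (4 * 1024, 256 * 1024, 16 * 1024 * 1024):
        speedup = cxl_speedup(size, mpi_link, cxl_link)
        print(f"{size:>10} B -> predicted speedup {speedup:.2f}x")
```

Evaluating such a predictor per MPI call, with measured access patterns and link parameters in place of the placeholders above, is what would let an optimization effort target only the invocations with the highest expected gain.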

Country of Origin
🇩🇪 Germany

Page Count
14 pages

Category
Computer Science:
Distributed, Parallel, and Cluster Computing