Offloading to CXL-based Computational Memory
By: Suyeon Lee, Kangkyu Park, Kwangsik Shin and more
Potential Business Impact:
Makes computers faster by moving work closer to data.
CXL-based Computational Memory (CCM) enables near-memory processing within expanded remote memory, offering an opportunity to reduce the data movement costs of disaggregated memory systems and to accelerate overall performance. However, existing operation offloading mechanisms cannot exploit the trade-offs among different offloading models based on different CXL protocols. This work first examines these trade-offs and demonstrates their impact on end-to-end performance and system efficiency for workloads with diverse data and processing requirements. We propose a novel 'Asynchronous Back-Streaming' protocol that carefully layers data and control transfer operations on top of the underlying CXL protocols. We design KAI, a system that realizes the asynchronous back-streaming model and supports asynchronous data movement and lightweight pipelining in host-CCM interactions. Overall, KAI reduces end-to-end runtime by up to 50.4%, and reduces CCM and host idle times by an average of 22.11x and 3.85x, respectively.
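As a rough illustration of why overlapping CCM computation with host-side consumption can shorten end-to-end runtime, the toy timing model below compares a fully synchronous offload with a chunk-by-chunk pipelined one. It is a sketch under stated assumptions and not KAI's implementation: the function names, the fixed per-chunk costs, and the two-stage pipeline structure are all hypothetical and do not come from the paper.

```python
# Toy timing model (hypothetical; not taken from the paper).
# Assumption: work splits into equal chunks, the CCM spends `ccm_time`
# per chunk, and the host spends `host_time` consuming each result.

def synchronous_makespan(n_chunks, ccm_time, host_time):
    """Host waits for the CCM to finish every chunk, then consumes all results."""
    return n_chunks * ccm_time + n_chunks * host_time


def pipelined_makespan(n_chunks, ccm_time, host_time):
    """CCM streams each finished chunk back; the host consumes it
    while the CCM is already working on the next chunk."""
    ccm_done = 0.0   # time at which the CCM finishes the current chunk
    host_done = 0.0  # time at which the host finishes consuming it
    for _ in range(n_chunks):
        ccm_done += ccm_time
        # The host can start only once the chunk has arrived and the
        # host itself has finished the previous chunk.
        host_done = max(ccm_done, host_done) + host_time
    return host_done


if __name__ == "__main__":
    print(synchronous_makespan(4, 2.0, 1.0))
    print(pipelined_makespan(4, 2.0, 1.0))
```

With four chunks, a CCM cost of 2.0 and a host cost of 1.0 per chunk, the synchronous model takes 12.0 time units and the pipelined model takes 9.0. The gap comes entirely from the host no longer sitting idle while the CCM computes, under the assumptions stated above.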
Similar Papers
MPI-over-CXL: Enhancing Communication Efficiency in Distributed HPC Systems
Distributed, Parallel, and Cluster Computing
Makes supercomputers share info faster, no copying.
Modeling the Potential of Message-Free Communication via CXL.mem
Distributed, Parallel, and Cluster Computing
Lets computers share memory faster between them.
Enabling Efficient Transaction Processing on CXL-Based Memory Sharing
Hardware Architecture
Makes computer systems process information much faster.