Offloading to CXL-based Computational Memory
By: Suyeon Lee, Kangkyu Park, Kwangsik Shin, and more
Potential Business Impact:
Makes computers faster by moving work closer to data.
CXL-based Computational Memory (CCM) enables near-memory processing within expanded remote memory, presenting opportunities to reduce the data movement costs associated with disaggregated memory systems and to accelerate overall performance. However, existing operation offloading mechanisms cannot exploit the trade-offs among offloading models built on different CXL protocols. This work first examines these trade-offs and demonstrates their impact on end-to-end performance and system efficiency for workloads with diverse data and processing requirements. We propose a novel 'Asynchronous Back-Streaming' protocol that carefully layers data and control transfer operations on top of the underlying CXL protocols. We design KAI, a system that realizes the asynchronous back-streaming model, supporting asynchronous data movement and lightweight pipelining in host-CCM interactions. Overall, KAI reduces end-to-end runtime by up to 50.4%, and reduces CCM and host idle times by an average of 22.11x and 3.85x, respectively.
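To make the idea of asynchronous back-streaming more concrete, here is a minimal host-side sketch of the general pattern the abstract describes: offloaded work produces partial results that stream back as they complete, so the host can pipeline its own post-processing instead of idling until the whole offload finishes. All names (ccm_worker, Chunk, etc.) are hypothetical illustrations, not the paper's actual KAI interface or any real CXL API; the "CCM" here is simply simulated with a thread.

```cpp
// Illustrative sketch only: overlapping offloaded "CCM" work with host-side
// post-processing, in the spirit of asynchronous back-streaming. The CCM is
// simulated by a thread; every identifier here is hypothetical.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <numeric>
#include <queue>
#include <thread>
#include <vector>

struct Chunk { int id; long partial_sum; };

std::queue<Chunk> results;       // partial results streamed back from the simulated CCM
std::mutex m;
std::condition_variable cv;
bool done = false;

// Simulated near-memory operator: reduces each chunk and streams the partial
// result back as soon as it is ready, with no end-of-batch barrier.
void ccm_worker(const std::vector<std::vector<int>>& chunks) {
    for (size_t i = 0; i < chunks.size(); ++i) {
        long s = std::accumulate(chunks[i].begin(), chunks[i].end(), 0L);
        {
            std::lock_guard<std::mutex> lk(m);
            results.push({static_cast<int>(i), s});
        }
        cv.notify_one();          // back-stream this partial result immediately
    }
    { std::lock_guard<std::mutex> lk(m); done = true; }
    cv.notify_one();
}

int main() {
    // Host-prepared data that conceptually resides in CXL-attached memory.
    std::vector<std::vector<int>> chunks(4, std::vector<int>(1 << 20, 1));
    std::thread ccm(ccm_worker, std::cref(chunks));

    long total = 0;
    int received = 0;
    // The host consumes partial results as they arrive, overlapping its own
    // aggregation with the CCM's ongoing computation (lightweight pipelining)
    // rather than blocking until the entire offload completes.
    while (true) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [] { return !results.empty() || done; });
        while (!results.empty()) {
            Chunk c = results.front(); results.pop();
            lk.unlock();
            total += c.partial_sum;   // host-side post-processing per chunk
            ++received;
            lk.lock();
        }
        if (done && results.empty()) break;
    }
    ccm.join();
    std::cout << "received " << received << " chunks, total = " << total << "\n";
}
```

The contrast with a synchronous offload model is the wait structure: a synchronous host would block on one completion signal for the whole batch, whereas here each partial result wakes the host, which is the source of the idle-time reductions the abstract reports.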
Similar Papers
MPI-over-CXL: Enhancing Communication Efficiency in Distributed HPC Systems
Distributed, Parallel, and Cluster Computing
Makes supercomputers share info faster, no copying.
CXLAimPod: CXL Memory is all you need in AI era
Operating Systems
Makes computers faster with mixed tasks.
Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation
Hardware Architecture
Makes computers share memory faster for better teamwork.