DOLMA: A Data Object Level Memory Disaggregation Framework for HPC Applications
By: Haoyu Zheng, Shouwei Gao, Jie Ren, and more
Potential Business Impact:
Lets computers use more memory with barely any slowdown.
Memory disaggregation is a promising approach to scaling memory capacity and improving utilization in HPC systems. However, the performance overhead of accessing remote memory poses a significant challenge, particularly for compute-intensive HPC applications whose execution times are highly sensitive to data locality. In this work, we present DOLMA, a Data Object Level Memory disAggregation framework designed for HPC applications. DOLMA intelligently identifies and offloads data objects to remote memory, while providing quantitative analysis to decide a suitable local memory size. Furthermore, DOLMA leverages the predictable memory access patterns typical in HPC applications and enables remote memory prefetch via a dual-buffer design. By carefully balancing local and remote memory usage and maintaining multi-thread concurrency, DOLMA provides a flexible and efficient solution for leveraging disaggregated memory in HPC domains while minimally compromising application performance. Evaluated with eight HPC workloads and computational kernels, DOLMA limits performance degradation to less than 16% while reducing local memory usage by up to 63%, on average.
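To make the dual-buffer prefetch idea from the abstract concrete, here is a minimal C++ sketch: while the compute thread works on one locally resident buffer, the next chunk of a remote data object is fetched asynchronously into a second buffer, then the two are swapped. All names here (fetch_remote_chunk, kChunk, process_object) are illustrative assumptions, not DOLMA's actual API; a real deployment would pull chunks over RDMA or a similar remote-memory transport.

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

constexpr std::size_t kChunk = 1 << 20;  // elements per chunk (assumed granularity)

// Stand-in for a remote-memory read (e.g., over RDMA). It returns synthetic
// data so the sketch stays self-contained and runnable.
std::vector<double> fetch_remote_chunk(std::size_t chunk_id) {
    return std::vector<double>(kChunk, static_cast<double>(chunk_id));
}

double process_object(std::size_t num_chunks) {
    double sum = 0.0;
    // Prime the pipeline: fetch chunk 0 into the "front" (compute) buffer.
    std::vector<double> front = fetch_remote_chunk(0);
    for (std::size_t i = 0; i < num_chunks; ++i) {
        // Start fetching the next chunk into the "back" buffer while we compute.
        std::future<std::vector<double>> back;
        if (i + 1 < num_chunks)
            back = std::async(std::launch::async, fetch_remote_chunk, i + 1);

        // Compute on the locally resident chunk (stand-in for the HPC kernel).
        sum += std::accumulate(front.begin(), front.end(), 0.0);

        // Swap buffers: the prefetched chunk becomes the next compute input.
        if (i + 1 < num_chunks) front = back.get();
    }
    return sum;
}

int main() { return process_object(8) > 0.0 ? 0 : 1; }
```

Because HPC kernels tend to sweep data objects in predictable order, the prefetch of chunk i+1 can overlap almost entirely with the computation on chunk i, which is how a scheme like this keeps only a small slice of each object in local memory while hiding most of the remote-access latency.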
Similar Papers
ODMA: On-Demand Memory Allocation Framework for LLM Serving on LPDDR-Class Accelerators
Hardware Architecture
Makes AI models run faster on cheaper chips.
A Verified High-Performance Composable Object Library for Remote Direct Memory Access (Extended Version)
Programming Languages
Lets computers share information super fast.
DOPO: A Dynamic PD-Disaggregation Architecture for Maximizing Goodput in LLM Inference Serving
Distributed, Parallel, and Cluster Computing
Makes AI answer questions faster and more reliably.