Score: 2

Disaggregated Design for GPU-Based Volumetric Data Structures

Published: March 10, 2025 | arXiv ID: 2503.07898v2

By: Massimiliano Meneghin, Ahmed H. Mahmoud

BigTech Affiliations: Massachusetts Institute of Technology

Potential Business Impact:

Makes computer simulations run up to 3 times faster.

Business Areas:

Big Data, Data and Analytics

Volumetric data structures typically prioritize data locality, focusing on efficient memory access patterns. This singular focus can neglect other critical performance factors, such as occupancy, communication, and kernel fusion. We introduce a novel disaggregated design that rebalances trade-offs between locality and these objectives: reducing communication overhead on distributed memory architectures, mitigating register pressure in complex boundary conditions, and enabling kernel fusion. We provide a thorough analysis of its benefits on a single-node multi-GPU Lattice Boltzmann Method (LBM) solver. Our evaluation spans dense, block-sparse, and multi-resolution discretizations, demonstrating our design's flexibility and efficiency. Leveraging this approach, we achieve up to a 3× speedup over state-of-the-art solutions.