Scaling the memory wall using mixed-precision -- HPG-MxP on an exascale machine

Published: July 15, 2025 | arXiv ID: 2507.11512v1

By: Aditya Kashi, Nicholson Koukpaizan, Hao Lu, and more

Potential Business Impact:

Enables memory-bandwidth-limited scientific simulations to run about 1.6x faster on exascale supercomputers by mixing double- and single-precision arithmetic.

Mixed-precision algorithms have been proposed as a way for scientific computing to benefit from some of the gains seen for artificial intelligence (AI) on recent high performance computing (HPC) platforms. A few applications dominated by dense matrix operations have seen substantial speedups by utilizing low-precision formats such as FP16. However, the majority of scientific simulation applications are memory-bandwidth limited. Beyond preliminary studies, the practical gain from using mixed-precision algorithms on a given HPC system is largely unclear. The High Performance GMRES Mixed Precision (HPG-MxP) benchmark has been proposed to measure the useful performance of an HPC system on sparse matrix-based mixed-precision applications. In this work, we present a highly optimized implementation of the HPG-MxP benchmark for an exascale system and describe our algorithm enhancements. We show for the first time a speedup of 1.6x using a combination of double- and single-precision arithmetic on modern GPU-based supercomputers.
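
The abstract does not spell out the benchmark's internals, but the double/single combination it describes typically follows the mixed-precision iterative-refinement pattern: do the expensive inner solve in FP32, and compute residuals and accumulate corrections in FP64. The sketch below is a hypothetical illustration of that pattern, not the HPG-MxP code; the dense direct solve stands in for a single-precision GMRES inner solver.

```python
import numpy as np

def mixed_precision_refinement(A, b, inner_solve, tol=1e-10, max_iters=50):
    """Iterative refinement: residuals and updates kept in FP64,
    the (expensive) inner solve performed in FP32."""
    x = np.zeros_like(b, dtype=np.float64)
    A32 = A.astype(np.float32)  # low-precision copy for the inner solver
    for _ in range(max_iters):
        r = b - A @ x  # FP64 residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        d = inner_solve(A32, r.astype(np.float32))  # FP32 correction
        x += d.astype(np.float64)  # accumulate correction in FP64
    return x

# Hypothetical usage: a direct FP32 solve stands in for single-precision GMRES.
rng = np.random.default_rng(0)
n = 200
A = np.eye(n) * n + rng.standard_normal((n, n))  # well-conditioned test matrix
b = rng.standard_normal(n)
x = mixed_precision_refinement(A, b, lambda A32, r32: np.linalg.solve(A32, r32))
print(np.linalg.norm(b - A @ x))  # residual near FP64 accuracy
```

The appeal on bandwidth-limited hardware is that FP32 matrix data is half the size of FP64, so the sparse matrix-vector products dominating each inner iteration move half the bytes, while the FP64 outer loop recovers double-precision accuracy.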

Page Count
12 pages

Category
Computer Science:
Distributed, Parallel, and Cluster Computing