HPL-MxP Benchmark: Mixed-Precision Algorithms, Iterative Refinement, and Scalable Data Generation
By: Jack Dongarra, Piotr Luszczek
Potential Business Impact:
Measures how fast supercomputers run AI-style math.
We present HPL-MxP, a mixed-precision benchmark that combines a lower-precision LU factorization with non-stationary iterative refinement based on GMRES. We evaluate the numerical stability of one method for generating the input matrix in a scalable fashion and show how diagonal scaling affects the solution quality in terms of the backward error. Performance results at large-scale supercomputing installations have reached exascale-level compute throughput, demonstrating the viability of the proposed benchmark for evaluating such machines. We also discuss the benchmark's potential for growing adoption as hardware accelerators for AI workloads proliferate, since reliable evaluation of these accelerators continues to pose a particular challenge for users.
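To make the algorithm concrete, here is a minimal sketch of mixed-precision iterative refinement in the same spirit: factorize in low precision (float32 here; HPL-MxP targets even lower precisions on accelerators), then refine in float64 with GMRES preconditioned by the low-precision LU factors. This is a toy NumPy/SciPy illustration under our own assumptions (diagonally dominant test matrix, variable names), not the HPL-MxP reference implementation; it assumes SciPy >= 1.12 for the `rtol` keyword of `gmres`.

```python
import numpy as np
import scipy.linalg as sla
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(0)
n = 500

# Diagonally dominant test matrix so the low-precision LU stays stable;
# HPL-MxP's scalable generator relies on diagonal scaling for a similar effect.
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)

# Low-precision LU factorization: the expensive O(n^3) step.
lu, piv = sla.lu_factor(A.astype(np.float32))

# Preconditioner: apply the float32 LU solve, promote the result to float64.
M = LinearOperator(
    (n, n),
    matvec=lambda v: sla.lu_solve((lu, piv), v.astype(np.float32)).astype(np.float64),
)

# Initial solve in low precision, then GMRES-based refinement in float64.
x0 = sla.lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)
x, info = gmres(A, b, x0=x0, M=M, rtol=1e-12)

# Scaled residual in the style of a backward-error check.
res = np.linalg.norm(b - A @ x, np.inf) / (
    np.linalg.norm(A, np.inf) * np.linalg.norm(x, np.inf) + np.linalg.norm(b, np.inf)
)
print(f"GMRES info={info}, scaled residual={res:.2e}")
```

The point of the sketch is the division of labor: nearly all floating-point work happens in the cheap low-precision factorization, while the few GMRES iterations in double precision recover a backward error close to working precision.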
Similar Papers
Scaling the memory wall using mixed-precision -- HPG-MxP on an exascale machine
Distributed, Parallel, and Cluster Computing
Makes supercomputers run science problems 1.6x faster.
Hardware Acceleration for HPS Algorithms in Two and Three Dimensions
Numerical Analysis
Makes computer math problems solve much faster.
Leveraging Hardware-Aware Computation in Mixed-Precision Matrix Multiply: A Tile-Centric Approach
Distributed, Parallel, and Cluster Computing
Makes computers solve problems faster and use less power.