MMA-Sim: Bit-Accurate Reference Model of Tensor Cores and Matrix Cores
By: Peichen Xie, Yang Wang, Fan Yang, and more
Potential Business Impact:
Reveals hidden computer math errors in AI hardware.
The rapidly growing computation demands of deep neural networks (DNNs) have driven hardware vendors to integrate matrix multiplication accelerators (MMAs), such as NVIDIA Tensor Cores and AMD Matrix Cores, into modern GPUs. However, because these MMAs implement floating-point matrix multiplication with distinct and undocumented arithmetic specifications, they can introduce numerical imprecision and inconsistency that compromise the stability and reproducibility of DNN training and inference. This paper presents MMA-Sim, the first bit-accurate reference model that reveals the detailed arithmetic behaviors of the MMAs from ten GPU architectures (eight from NVIDIA and two from AMD). Our methodology dissects the MMAs with a combination of targeted and randomized tests, deriving nine arithmetic algorithms that simulate their floating-point matrix multiplication. Large-scale validation confirms bitwise equivalence between MMA-Sim and the real hardware. Using MMA-Sim, we investigate arithmetic behaviors that affect DNN training stability, and identify undocumented behaviors that could lead to significant errors.
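To make the idea concrete, below is a minimal Python/NumPy sketch of the kind of arithmetic algorithm such a reference model encodes: FP16 inputs whose pairwise products are exact in wider precision, followed by a specific accumulation order and rounding discipline. This is illustrative only; the function name simulated_mma_dot and the sequential round-to-nearest FP32 accumulation are assumptions made for this sketch, not the paper's derived algorithms. The actual per-architecture details (accumulation order, intermediate precision, rounding) are precisely what MMA-Sim reverse-engineers, and the paper reports that they differ across GPUs.

```python
import numpy as np

def simulated_mma_dot(a_fp16, b_fp16, c_fp32):
    """Hypothetical bit-level model of one MMA dot product:
    FP16 operands, exact products, sequential FP32 accumulation.
    Real hardware may use a different order or rounding mode."""
    acc = np.float32(c_fp32)
    for x, y in zip(a_fp16, b_fp16):
        # An fp16*fp16 product has at most a 22-bit significand,
        # so it is representable exactly in fp32 (24-bit significand).
        prod = np.float32(np.float64(np.float16(x)) * np.float64(np.float16(y)))
        # One round-to-nearest-even FP32 rounding per accumulation step.
        acc = np.float32(acc + prod)
    return acc

# Usage: compare the modeled result against a plain FP32 dot product.
a = np.random.randn(16).astype(np.float16)
b = np.random.randn(16).astype(np.float16)
modeled = simulated_mma_dot(a, b, np.float32(0.0))
naive = np.dot(a.astype(np.float32), b.astype(np.float32))
print(modeled, naive, modeled == naive)
```

A bit-accurate model must match the hardware on every input, so validating a candidate algorithm like this one amounts to checking bitwise equality against real MMA outputs over targeted and randomized test matrices, which is the validation strategy the paper describes at scale.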
Similar Papers
Accurate Models of NVIDIA Tensor Cores
Mathematical Software
Makes computer math results the same everywhere.
A Configurable Mixed-Precision Fused Dot Product Unit for GPGPU Tensor Computation
Hardware Architecture
Speeds up AI learning by combining math types.
Leveraging Hardware-Aware Computation in Mixed-Precision Matrix Multiply: A Tile-Centric Approach
Distributed, Parallel, and Cluster Computing
Makes computers solve problems faster and use less power.