Statistical Quality and Reproducibility of Pseudorandom Number Generators in Machine Learning Technologies

Published: July 2, 2025 | arXiv ID: 2507.03007v1

By: Benjamin A. Antunes

Potential Business Impact:

Makes machine learning experiments more reliable and reproducible.

Business Areas:
A/B Testing, Data and Analytics

Machine learning (ML) frameworks rely heavily on pseudorandom number generators (PRNGs) for tasks such as data shuffling, weight initialization, dropout, and optimization. Yet the statistical quality and reproducibility of these generators, particularly when integrated into frameworks like PyTorch, TensorFlow, and NumPy, are underexplored. In this paper, we compare the statistical quality of PRNGs used in ML frameworks (Mersenne Twister, PCG, and Philox) against their original C implementations. Using the rigorous TestU01 BigCrush test suite, we evaluate 896 independent random streams for each generator. Our findings challenge claims of statistical robustness, revealing that even generators labeled "crush-resistant" (e.g., PCG, Philox) may fail certain statistical tests. Surprisingly, we also observe differences in failure profiles between the native and framework-integrated versions of the same algorithm, pointing to implementation differences between the two.
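
The sketch below is not the authors' harness; it is a minimal illustration, assuming NumPy's standard bit generators, of how one might spawn independent streams of the PRNG families studied in the paper (PCG64, Philox, MT19937) and dump their output for an external statistical test suite such as TestU01 BigCrush. The stream count of 896 mirrors the paper's setup, while the seeding scheme and file naming are assumptions for illustration.

```python
import numpy as np

N_STREAMS = 896  # number of independent streams evaluated per generator in the paper

def make_streams(bit_generator_cls, root_seed=12345, n=N_STREAMS):
    """Spawn n statistically independent child seeds and wrap each in a Generator."""
    children = np.random.SeedSequence(root_seed).spawn(n)
    return [np.random.Generator(bit_generator_cls(s)) for s in children]

# One list of independent streams per PRNG family studied in the paper.
pcg_streams = make_streams(np.random.PCG64)
philox_streams = make_streams(np.random.Philox)
mt_streams = make_streams(np.random.MT19937)

# Each stream can be dumped as raw 32-bit words and fed to an external
# TestU01 BigCrush run (the file name is hypothetical).
words = pcg_streams[0].integers(0, 2**32, size=1024, dtype=np.uint64)
words.astype(np.uint32).tofile("pcg64_stream_000.bin")
```

A comparable harness for the framework-integrated generators would draw the raw words through the PyTorch or TensorFlow APIs instead, which is where the implementation differences noted in the abstract could surface.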

Page Count
12 pages

Category
Computer Science:
Other Computer Science