From Euler to Today: Universal Mathematical Fallibility A Large-Scale Computational Analysis of Errors in ArXiv Papers
By: Igor Rivin
Potential Business Impact:
Finds math mistakes in old and new papers.
We present the results of a large-scale computational analysis of mathematical papers from the ArXiv repository, demonstrating a comprehensive system that not only detects mathematical errors but provides complete referee reports with journal tier recommendations. Our automated analysis system processed over 37,000 papers across multiple mathematical categories, revealing significant error rates and quality distributions. Remarkably, the system identified errors in papers spanning three centuries of mathematics, including works by Leonhard Euler (1707-1783) and Peter Gustav Lejeune Dirichlet (1805-1859), as well as contemporary Fields medalists. In Numerical Analysis (math.NA), we observed an error rate of 9.6\% (2,271 errors in 23,761 papers), while Geometric Topology (math.GT) showed 6.5\% (862 errors in 13,209 papers). Strikingly, Category Theory (math.CT) showed 0\% errors in 93 papers analyzed, with evidence suggesting these results are ``easier'' for automated analysis. Beyond error detection, the system evaluated papers for journal suitability, recommending 0.4\% for top generalist journals, 15.5\% for top field-specific journals, and categorizing the remainder across specialist venues. These findings demonstrate both the universality of mathematical error across all eras and the feasibility of automated comprehensive mathematical peer review at scale. This work demonstrates that the methodology, while applied here to mathematics, is discipline-agnostic and could be readily extended to physics, computer science, and other fields represented in the ArXiv repository.
Similar Papers
To Err Is Human: Systematic Quantification of Errors in Published AI Papers via LLM Analysis
Artificial Intelligence
Finds and fixes errors in AI research papers.
A Case for a "Refutations and Critiques'' Track in Statistics Journals
Methodology
Fixes bad science papers with new review system.
FLAWS: A Benchmark for Error Identification and Localization in Scientific Papers
Computation and Language
Helps computers find mistakes in science papers.