Enhancing Symbolic Regression with Quality-Diversity and Physics-Inspired Constraints

Published: March 24, 2025 | arXiv ID: 2503.19043v1

By: J.-P. Bruneton

Potential Business Impact:

Finds hidden math rules in data.

Business Areas:
Quantum Computing Science and Engineering

This paper presents QDSR, an advanced symbolic regression (SR) system that integrates genetic programming (GP), a quality-diversity (QD) algorithm, and a dimensional analysis (DA) engine. Our method focuses on exact symbolic recovery of known expressions from datasets, with a particular emphasis on the Feynman-AI benchmark. On this widely used collection of 117 physics equations, QDSR achieves an exact recovery rate of 91.6%, surpassing all previous SR methods by over 20 percentage points. Our method also exhibits strong robustness to noise. Beyond QD and DA, this high success rate results from a profitable trade-off between vocabulary expressiveness and search space size: we show that significantly expanding the vocabulary with precomputed meaningful variables (e.g., dimensionless combinations and well-chosen scalar products) often reduces equation complexity, ultimately leading to better performance. Ablation studies also show that QD alone already outperforms the state of the art. This suggests that a simple integration of QD, by projecting individuals onto a QD grid, can significantly boost performance in existing algorithms, without requiring major system overhauls.
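
To make the idea of "projecting individuals onto a QD grid" concrete, here is a minimal sketch in the style of a MAP-Elites archive. It is not the QDSR implementation: the abstract does not say which descriptors define the grid, so the two used here (expression length and number of distinct variables), the bin width, and all names (`Individual`, `cell_of`, `insert`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Individual:
    expr: str     # candidate expression, stored as text
    length: int   # assumed descriptor 1: number of tokens in the expression
    n_vars: int   # assumed descriptor 2: number of distinct variables used
    error: float  # fit error on the dataset (lower is better)


def cell_of(ind, length_bin=5):
    """Project an individual onto a grid cell using its two descriptors."""
    return (ind.length // length_bin, ind.n_vars)


def insert(grid, ind):
    """Store `ind` if its cell is empty or it beats the cell's current elite."""
    key = cell_of(ind)
    elite = grid.get(key)
    if elite is None or ind.error < elite.error:
        grid[key] = ind
        return True
    return False


# Toy candidates with made-up error values, for illustration only.
grid = {}
candidates = [
    Individual("m*a", 3, 2, 0.40),
    Individual("m*v", 3, 2, 0.10),       # same cell, lower error: replaces "m*a"
    Individual("a*m", 3, 2, 0.40),       # same cell, worse than elite: rejected
    Individual("m*v**2/2", 7, 2, 0.00),  # longer expression: lands in a new cell
]
results = [insert(grid, c) for c in candidates]
```

Because each cell keeps only its own best individual, structurally different expressions survive side by side instead of competing in a single ranking. That is what lets the grid sit on top of an existing GP loop: candidates are passed through `insert` after evaluation, and parents are drawn from the grid's elites.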

Page Count
23 pages

Category
Computer Science:
Neural and Evolutionary Computing