Neural Network-Guided Symbolic Regression for Interpretable Descriptor Discovery in Perovskite Catalysts
By: Yeming Xian, Xiaoming Wang, Yanfa Yan
Potential Business Impact:
Finds better ways to make oxygen from water.
Understanding and predicting the activity of oxide perovskite catalysts for the oxygen evolution reaction (OER) requires descriptors that are both accurate and physically interpretable. While symbolic regression (SR) offers a path to discover such formulas, its performance degrades with high-dimensional inputs and small datasets. We present a two-phase framework that combines neural networks (NNs), feature importance analysis, and SR to discover interpretable descriptors for OER activity in oxide perovskites. In Phase I, using a small dataset and seven structural features, we reproduce and improve the known μ/t descriptor by engineering composite features and applying SR, achieving training and validation MAEs of 22.8 and 20.8 meV, respectively. In Phase II, we expand to 164 features, reduce dimensionality, and identify LUMO energy as a key electronic descriptor. A final formula using μ/t, μ/RA, and LUMO energy achieves improved accuracy (training and validation MAEs of 22.1 and 20.6 meV) with strong physical interpretability. Our results demonstrate that NN-guided SR enables accurate, interpretable, and physically meaningful descriptor discovery in data-scarce regimes, indicating that interpretability need not come at the cost of accuracy in materials informatics.
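To make the structural quantities in the abstract concrete, the following is a minimal Python sketch of how μ/t and μ/RA might be computed from ionic radii, together with the mean absolute error (MAE) metric used to report accuracy. It assumes that μ is the octahedral factor (r_B / r_O), that t is the Goldschmidt tolerance factor, and that RA denotes the A-site ionic radius; the abstract does not spell out these definitions. The function names, the oxygen radius constant, and the example radii are illustrative placeholders, not values or code from the paper, and the sketch does not reproduce the authors' final formula.

```python
import math

# Assumption: ionic radius of O2- in angstroms (a commonly quoted Shannon value).
R_O = 1.40


def tolerance_factor(r_a: float, r_b: float, r_x: float = R_O) -> float:
    """Goldschmidt tolerance factor t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def octahedral_factor(r_b: float, r_x: float = R_O) -> float:
    """Octahedral factor mu = r_B / r_X."""
    return r_b / r_x


def structural_descriptors(r_a: float, r_b: float, r_x: float = R_O) -> dict:
    """Return mu, t, and the composite ratios mu/t and mu/R_A.

    Assumption: 'RA' in the abstract is read here as the A-site ionic radius.
    """
    mu = octahedral_factor(r_b, r_x)
    t = tolerance_factor(r_a, r_b, r_x)
    return {"mu": mu, "t": t, "mu_over_t": mu / t, "mu_over_RA": mu / r_a}


def mean_absolute_error(y_true, y_pred) -> float:
    """Plain MAE: the average of |y_true - y_pred| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical radii in angstroms, chosen only to exercise the functions.
    print(structural_descriptors(r_a=1.36, r_b=0.60))
```

Composite ratios like these are the kind of engineered features that could be passed to a symbolic regression tool alongside an electronic feature such as LUMO energy; the MAE helper then scores a candidate formula on training and validation splits.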