Model Accuracy and Data Heterogeneity Shape Uncertainty Quantification in Machine Learning Interatomic Potentials
By: Fei Shuang, Zixiong Wei, Kai Liu and more
Potential Business Impact:
Finds new atomic structures with fewer mistakes.
Machine learning interatomic potentials (MLIPs) enable accurate atomistic modelling, but reliable uncertainty quantification (UQ) remains elusive. In this study, we investigate two UQ strategies, ensemble learning and D-optimality, within the atomic cluster expansion framework. We find that higher model accuracy strengthens the correlation between predicted uncertainties and actual errors and improves novelty detection, with D-optimality yielding more conservative estimates. Both methods deliver well-calibrated uncertainties on homogeneous training sets, yet they underpredict errors and exhibit reduced novelty sensitivity on heterogeneous datasets. To address this limitation, we introduce clustering-enhanced local D-optimality, which partitions configuration space into clusters during training and applies D-optimality within each cluster. This approach substantially improves the detection of novel atomic environments in heterogeneous datasets. Our findings clarify the roles of model fidelity and data heterogeneity in UQ performance and provide a practical route to robust active learning and adaptive sampling strategies for MLIP development.
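The idea of applying D-optimality within clusters can be illustrated with a small sketch. Everything here is an assumption made for illustration and not the paper's implementation: the 3-component feature vectors stand in for atomic cluster expansion basis values, the clustering is a plain k-means (the abstract does not say which clustering method is used), and the score is the standard D-optimal design leverage x^T M^-1 x, chosen because det(M + x x^T) = det(M) (1 + x^T M^-1 x). The function names `d_opt_score`, `kmeans`, and `local_d_opt_score` are hypothetical.

```python
import numpy as np


def d_opt_score(train, x, ridge=1e-8):
    """Leverage x^T M^-1 x with M = train^T train + ridge * I.

    Adding x to the training set multiplies det(M) by (1 + score),
    so a large score marks x as novel in the D-optimality sense.
    """
    m = train.T @ train + ridge * np.eye(train.shape[1])
    return float(x @ np.linalg.solve(m, x))


def assign(data, centers):
    """Index of the nearest center for every row of data."""
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)


def kmeans(data, k, n_iter=50):
    """Plain Lloyd k-means with deterministic farthest-point initialisation."""
    centers = [data[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(data - c, axis=1) for c in centers], axis=0)
        centers.append(data[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = assign(data, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(axis=0)
    return centers, assign(data, centers)


def local_d_opt_score(train, centers, labels, x, ridge=1e-8):
    """Score x against the training points of its nearest cluster only."""
    j = assign(x[None, :], centers)[0]
    return d_opt_score(train[labels == j], x, ridge)


# Toy heterogeneous training set (made-up data): two tight groups of
# environments that occupy different directions of feature space.
rng = np.random.default_rng(0)
cluster_a = np.array([1.0, 0.0, 0.0]) + 0.01 * rng.standard_normal((50, 3))
cluster_b = np.array([0.0, 1.0, 0.0]) + 0.01 * rng.standard_normal((50, 3))
train = np.vstack([cluster_a, cluster_b])

centers, labels = kmeans(train, k=2)

# A new environment close to group A but distorted along a direction
# that only group B covers.
x_new = np.array([1.0, 0.5, 0.0])

global_score = d_opt_score(train, x_new)
local_score = local_d_opt_score(train, centers, labels, x_new)
print(f"global score: {global_score:.4f}")
print(f"local score:  {local_score:.4f}")
```

In this toy setup the global score stays small because group B already covers the distortion direction, while the local score, computed against group A alone, is far larger. That is one way to picture why a global D-optimality criterion can lose novelty sensitivity on heterogeneous data.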