Credible Uncertainty Quantification under Noise and System Model Mismatch
By: Penggao Yan, Li-Ta Hsu
Potential Business Impact:
Makes the uncertainty reported by computer estimation systems more trustworthy and easier to check.
State estimators often provide self-assessed uncertainty metrics, such as covariance matrices, whose reliability is critical for downstream tasks. However, these self-assessments can be misleading due to underlying modeling violations such as noise model mismatch or system model mismatch. This letter addresses the problem of estimator credibility by introducing a unified, multi-metric evaluation framework. We construct a compact credibility portfolio that synergistically combines traditional metrics like the Normalized Estimation Error Squared (NEES) and the Noncredibility Index (NCI) with proper scoring rules, namely the Negative Log-Likelihood (NLL) and the Energy Score (ES). Our key contributions are a novel energy distance-based location test to robustly detect system model misspecification and a method that leverages the asymmetric sensitivities of NLL and ES to distinguish optimistic covariance scaling from system bias. Monte Carlo simulations across six distinct credibility scenarios demonstrate that our proposed method achieves high classification accuracy (80-100%), drastically outperforming single-metric baselines, which consistently fail to provide a complete and correct diagnosis. This framework provides a practical tool for turning patterns of credibility indicators into actionable diagnoses of model deficiencies.
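The abstract does not include code, so the following is only a minimal sketch of how the metrics it names can be computed for a single Gaussian state estimate N(x_hat, P) against the true state. The function names, sample count, and numerical values are illustrative assumptions, not the authors' implementation, and the NCI and the energy distance-based location test (which need a reference estimator and repeated runs) are omitted.

```python
import numpy as np

def nees(x_true, x_hat, P):
    """Normalized Estimation Error Squared: e^T P^{-1} e."""
    e = x_true - x_hat
    return float(e @ np.linalg.solve(P, e))

def gaussian_nll(x_true, x_hat, P):
    """Negative log-likelihood of the true state under the reported Gaussian estimate."""
    d = x_hat.size
    _, logdet = np.linalg.slogdet(P)
    return 0.5 * (d * np.log(2.0 * np.pi) + logdet + nees(x_true, x_hat, P))

def energy_score(x_true, x_hat, P, n_samples=1000, seed=0):
    """Monte Carlo estimate of the Energy Score:
    ES = E||X - x_true|| - 0.5 * E||X - X'||, with X, X' ~ N(x_hat, P)."""
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal(x_hat, P, size=n_samples)
    Xp = rng.multivariate_normal(x_hat, P, size=n_samples)
    term1 = np.linalg.norm(X - x_true, axis=1).mean()
    term2 = np.linalg.norm(X - Xp, axis=1).mean()
    return float(term1 - 0.5 * term2)

# Example: for a credible 2-D estimate, NEES averaged over many Monte Carlo runs
# should be close to the state dimension (here 2).
x_true = np.array([1.0, -0.5])
P = np.array([[0.04, 0.01], [0.01, 0.09]])
x_hat = x_true + np.random.default_rng(1).multivariate_normal(np.zeros(2), P)
print(nees(x_true, x_hat, P), gaussian_nll(x_true, x_hat, P), energy_score(x_true, x_hat, P))
```

In this reading of the abstract, NLL penalizes both the squared error and the reported covariance volume, while ES depends on the error and predictive spread through Euclidean distances; it is this asymmetry that the paper exploits to separate covariance scaling problems from system bias.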