Uncertainty-Aware Data-Efficient AI: An Information-Theoretic Perspective

Published: December 4, 2025 | arXiv ID: 2512.05267v1

By: Osvaldo Simeone, Yaniv Romano

In context-specific applications such as robotics, telecommunications, and healthcare, artificial intelligence systems often face the challenge of limited training data. This scarcity introduces epistemic uncertainty, i.e., reducible uncertainty stemming from incomplete knowledge of the underlying data distribution, which fundamentally limits predictive performance. This review paper examines formal methodologies that address data-limited regimes through two complementary approaches: quantifying epistemic uncertainty and mitigating data scarcity via synthetic data augmentation. We begin by reviewing generalized Bayesian learning frameworks that characterize epistemic uncertainty through generalized posteriors in the model parameter space, as well as "post-Bayes" learning frameworks. We continue by presenting information-theoretic generalization bounds that formalize the relationship between training data quantity and predictive uncertainty, providing a theoretical justification for generalized Bayesian learning. Moving beyond methods with asymptotic statistical validity, we survey uncertainty quantification methods that provide finite-sample statistical guarantees, including conformal prediction and conformal risk control. Finally, we examine recent advances in data efficiency by combining limited labeled data with abundant model predictions or synthetic data. Throughout, we take an information-theoretic perspective, highlighting the role of information measures in quantifying the impact of data scarcity.
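
To make the finite-sample guarantee mentioned in the abstract concrete, the following is a minimal sketch of split conformal prediction for regression. It is not taken from the paper: it assumes an absolute-residual nonconformity score, exchangeable calibration and test data, and a pre-trained point predictor, and the names `conformal_quantile` and `split_conformal_interval` are hypothetical.

```python
import numpy as np


def conformal_quantile(cal_scores, alpha):
    """Return the finite-sample-corrected (1 - alpha) quantile of the scores.

    Uses rank ceil((n + 1) * (1 - alpha)) among the n sorted calibration
    scores. If that rank exceeds n, the threshold is infinite, which is what
    the coverage guarantee requires when the calibration set is too small.
    """
    cal_scores = np.asarray(cal_scores, dtype=float)
    n = cal_scores.shape[0]
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf
    return np.sort(cal_scores)[k - 1]


def split_conformal_interval(predict, x_cal, y_cal, x_test, alpha=0.1):
    """Build prediction intervals around `predict(x_test)`.

    `predict` is any fixed point predictor (assumed trained on data that is
    disjoint from the calibration set). The score is the absolute residual.
    """
    scores = np.abs(np.asarray(y_cal) - predict(x_cal))
    q = conformal_quantile(scores, alpha)
    center = predict(x_test)
    return center - q, center + q
```

Under the exchangeability assumption, intervals built this way contain the true label with probability at least 1 - alpha, averaged over the draw of calibration and test points, regardless of how accurate `predict` is. A poor predictor yields wider intervals rather than invalid ones.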

Epistemic Artificial Intelligence is Essential for Machine Learning Models to Truly 'Know When They Do Not Know'

Artificial Intelligence

AI learns to admit when it doesn't know.

8 May 2025 1

90%

A Theory of the Mechanics of Information: Generalization Through Measurement of Uncertainty (Learning is Measuring)

Machine Learning (CS)

Makes computers learn from messy data easily.

26 Oct 2025 2

89%

Tutorial on the Probabilistic Unification of Estimation Theory, Machine Learning, and Generative AI

Machine Learning (CS)

Helps computers learn from messy, unclear information.

21 Aug 2025 0
