Score: 1

Synthetic-Powered Predictive Inference

Published: May 19, 2025 | arXiv ID: 2505.13432v2

By: Meshi Bashari, Roy Maor Lotan, Yonghoon Lee and more

Potential Business Impact:

Makes AI more sure about its guesses.

Business Areas:
Predictive Analytics, Artificial Intelligence, Data and Analytics, Software

Conformal prediction is a framework for predictive inference with a distribution-free, finite-sample guarantee. However, it tends to provide uninformative prediction sets when calibration data are scarce. This paper introduces Synthetic-powered predictive inference (SPI), a novel framework that incorporates synthetic data -- e.g., from a generative model -- to improve sample efficiency. At the core of our method is a score transporter: an empirical quantile mapping that aligns nonconformity scores from trusted, real data with those from synthetic data. By carefully integrating the score transporter into the calibration process, SPI provably achieves finite-sample coverage guarantees without making any assumptions about the real and synthetic data distributions. When the score distributions are well aligned, SPI yields substantially tighter and more informative prediction sets than standard conformal prediction. Experiments on image classification -- augmenting the data with synthetic images generated by a diffusion model -- and on tabular regression demonstrate notable improvements in predictive efficiency in data-scarce settings.
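
To make the abstract's two ingredients concrete, here is a minimal sketch in plain Python. `conformal_threshold` is the standard split-conformal calibration rule, and it shows why scarce calibration data yields uninformative sets: with too few scores the threshold becomes infinite. `quantile_map` is a generic empirical quantile mapping between two score samples, included only to illustrate the kind of operation a score transporter performs. Both function names and the mapping's rank convention are assumptions made for this example; the sketch does not reproduce SPI's calibration procedure or its coverage guarantee.

```python
import math
from bisect import bisect_right


def conformal_threshold(cal_scores, alpha):
    """Standard split-conformal threshold at miscoverage level alpha.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration score,
    or infinity when n is too small for that rank to exist.
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points: the prediction set is trivial
    return sorted(cal_scores)[k - 1]


def quantile_map(score, source_scores, target_scores):
    """Illustrative empirical quantile mapping (hypothetical helper, not SPI itself).

    Finds the empirical CDF level of `score` within `source_scores` and returns
    the value of `target_scores` at the same level.
    """
    src = sorted(source_scores)
    tgt = sorted(target_scores)
    rank = bisect_right(src, score)  # number of source scores <= score
    level = rank / len(src)
    idx = min(len(tgt) - 1, max(0, math.ceil(level * len(tgt)) - 1))
    return tgt[idx]
```

With a single calibration score and `alpha = 0.1`, `conformal_threshold` returns infinity, so every candidate label or value lands in the prediction set. This is the data-scarce failure mode that the abstract says SPI addresses by drawing on synthetic scores.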

Page Count
48 pages

Category
Computer Science:
Machine Learning (CS)