Synthetic-Powered Predictive Inference
By: Meshi Bashari, Roy Maor Lotan, Yonghoon Lee and more
Potential Business Impact:
Makes AI more sure about its guesses.
Conformal prediction is a framework for predictive inference with a distribution-free, finite-sample guarantee. However, it tends to provide uninformative prediction sets when calibration data are scarce. This paper introduces Synthetic-powered predictive inference (SPI), a novel framework that incorporates synthetic data (e.g., from a generative model) to improve sample efficiency. At the core of our method is a score transporter: an empirical quantile mapping that aligns nonconformity scores from trusted, real data with those from synthetic data. By carefully integrating the score transporter into the calibration process, SPI provably achieves finite-sample coverage guarantees without making any assumptions about the real and synthetic data distributions. When the score distributions are well aligned, SPI yields substantially tighter and more informative prediction sets than standard conformal prediction. Experiments on image classification, where the data are augmented with synthetic images generated by a diffusion model, and on tabular regression demonstrate notable improvements in predictive efficiency in data-scarce settings.
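To make the data-scarcity problem concrete, here is a minimal sketch of the threshold used by standard split conformal prediction, not of SPI itself, whose score transporter is not specified in this abstract. The function name `conformal_threshold` and the toy scores are illustrative assumptions. With n calibration scores and miscoverage level alpha, the threshold is the ceil((n + 1)(1 - alpha))-th smallest score. When n is too small for that rank to exist, the threshold is infinite and the prediction set covers every possible label.

```python
import math

def conformal_threshold(scores, alpha):
    """Standard split-conformal threshold (hypothetical helper, not SPI).

    Returns the k-th smallest calibration score, k = ceil((n + 1) * (1 - alpha)),
    or math.inf when k exceeds the number of calibration scores n.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points: the only valid set is the trivial one.
        return math.inf
    return sorted(scores)[k - 1]

# Toy nonconformity scores (made up for illustration).
few_scores = [0.5, 0.1, 0.4, 0.2, 0.3]

# With n = 5 and alpha = 0.1, k = ceil(5.4) = 6 > 5, so the threshold is infinite.
print(conformal_threshold(few_scores, alpha=0.1))   # inf

# A looser level, alpha = 0.25, gives k = ceil(4.5) = 5: the largest score.
print(conformal_threshold(few_scores, alpha=0.25))  # 0.5
```

At alpha = 0.1, at least 9 calibration points are needed before the threshold becomes finite, which is the regime where, according to the abstract, borrowing strength from synthetic data is meant to help.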
Similar Papers
Statistical Inference Leveraging Synthetic Data with Distribution-Free Guarantees
Methodology
Makes AI smarter with fake and real data.
Extending Prediction-Powered Inference through Conformal Prediction
Methodology
Makes computer predictions more trustworthy and private.
Sparse Activations as Conformal Predictors
Machine Learning (CS)
Makes AI guess better by showing possible answers.