Conditional validity and a fast approximation formula of full conformal prediction sets
By: Nicolai Amann
Potential Business Impact:
Makes predictions trustworthy for the data at hand, with far less computation.
Prediction sets based on full conformal prediction have attracted increasing interest in statistical learning due to their universal marginal coverage guarantees. However, practitioners have refrained from using them in applications for two reasons. Firstly, full conformal prediction comes at a very high computational cost, exceeding even that of cross-validation. Secondly, a practitioner is typically not interested in a marginal coverage guarantee, which averages over all possible (but unavailable) training data sets, but rather in a guarantee conditional on the specific training data at hand. This work tackles both problems. Firstly, we show that full conformal prediction sets are conditionally conservative given the training data if the conformity score is stochastically bounded and satisfies a stability condition. Secondly, we propose an approximation of the full conformal prediction set that, under the same stability assumption, asymptotically attains the training-conditional coverage of full conformal prediction and can be computed far more easily. Furthermore, we show that under the stability assumption, $n$-fold cross-conformal prediction also enjoys the same asymptotic training-conditional coverage guarantees as full conformal prediction. If the conformity score is defined as the out-of-sample prediction error, our approximation of the full conformal set coincides with the symmetrized Jackknife. We conclude that for this conformity score, when based on a stable prediction algorithm, full conformal prediction, $n$-fold cross-conformal prediction, the Jackknife+, our approximation formula, and hence also the Jackknife all yield the same asymptotic training-conditional coverage guarantees.
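To make the contrast concrete, here is a minimal Python sketch of the two objects the abstract compares: the full conformal prediction set, computed by refitting on an augmented sample for every candidate label on a grid, and the symmetrized Jackknife interval, which the paper's approximation recovers when the conformity score is the out-of-sample absolute prediction error. Ridge regression stands in for a "stable" prediction algorithm; the grid, data, and ridge penalty are illustrative assumptions, not the paper's setup.

import numpy as np

def ridge_fit_predict(X, y, x_new, lam=1.0):
    """Fit ridge regression on (X, y) and predict at x_new (a stable algorithm stand-in)."""
    d = X.shape[1]
    beta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    return x_new @ beta

def full_conformal_set(X, y, x_new, y_grid, alpha=0.1, lam=1.0):
    """Full conformal: for each candidate label y0, refit on the augmented
    sample and keep y0 if its absolute residual ranks low enough among all n+1."""
    n, d = X.shape
    k = int(np.ceil((1 - alpha) * (n + 1)))
    kept = []
    for y0 in y_grid:
        X_aug = np.vstack([X, x_new])
        y_aug = np.append(y, y0)
        beta = np.linalg.solve(X_aug.T @ X_aug + lam * np.eye(d), X_aug.T @ y_aug)
        scores = np.abs(y_aug - X_aug @ beta)
        if np.sum(scores <= scores[-1]) <= k:  # rank of candidate's score
            kept.append(y0)
    return kept

def symmetrized_jackknife(X, y, x_new, alpha=0.1, lam=1.0):
    """Symmetrized Jackknife: leave-one-out absolute residuals give the
    half-width of an interval centered at the full-sample prediction."""
    n = len(y)
    loo_scores = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        pred_i = ridge_fit_predict(X[mask], y[mask], X[i], lam)
        loo_scores[i] = abs(y[i] - pred_i)
    k = min(n, int(np.ceil((1 - alpha) * (n + 1))))  # conformal quantile index
    q = np.sort(loo_scores)[k - 1]
    center = ridge_fit_predict(X, y, x_new, lam)
    return center - q, center + q

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)
x_new = rng.normal(size=3)
lo, hi = symmetrized_jackknife(X, y, x_new)
grid = np.linspace(lo - 3, hi + 3, 400)
kept = full_conformal_set(X, y, x_new, grid)
print("Jackknife interval:    [%.2f, %.2f]" % (lo, hi))
print("Full conformal (grid): [%.2f, %.2f]" % (min(kept), max(kept)))

On this toy example the two intervals nearly coincide, reflecting the paper's point: for a stable algorithm with the out-of-sample error score, the fast Jackknife-style formula needs only n leave-one-out refits, whereas the full conformal set needs one refit per grid point.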
Similar Papers
Minimum Volume Conformal Sets for Multivariate Regression
Machine Learning (Stat)
Makes prediction sets as small as possible while staying reliable.
SpeedCP: Fast Kernel-based Conditional Conformal Prediction
Methodology
Makes computer predictions more trustworthy and faster.
Reliable Statistical Guarantees for Conformal Predictors with Small Datasets
Machine Learning (CS)
Makes AI predictions reliable even with small datasets.