Ensuring Calibration Robustness in Split Conformal Prediction Under Adversarial Attacks
By: Xunlei Qian, Yue Xing
Potential Business Impact:
Makes AI predictions more trustworthy against adversarial attacks.
Conformal prediction (CP) provides distribution-free, finite-sample coverage guarantees but critically relies on exchangeability, a condition often violated under distribution shift. We study the robustness of split conformal prediction under adversarial perturbations at test time, focusing on both coverage validity and the resulting prediction set size. Our theoretical analysis characterizes how the strength of adversarial perturbations during calibration affects coverage guarantees under adversarial test conditions. We further examine the impact of adversarial training at the model-training stage. Extensive experiments support our theory: (i) prediction coverage varies monotonically with the calibration-time attack strength, so a nonzero calibration-time attack can be used to predictably control coverage under adversarial tests; (ii) the target coverage can hold over a range of test-time attacks: with a suitable calibration-time attack, coverage stays within any chosen tolerance band across a contiguous set of perturbation levels; and (iii) adversarial training at the model-training stage produces tighter prediction sets that remain highly informative.
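To make the setup concrete, below is a minimal NumPy sketch of split conformal classification where the calibration set is perturbed by an FGSM-style attack before the nonconformity scores are computed. The binary logistic model, the score s(x, y) = 1 − P(y | x), the synthetic data, and the attack itself are illustrative assumptions for this sketch, not the paper's exact procedure.

```python
# Minimal sketch: split conformal prediction with an adversarially perturbed
# calibration set. Illustrative assumptions: binary logistic model, FGSM-style
# attack, nonconformity score s(x, y) = 1 - P(y | x), synthetic data.
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.5, steps=2000):
    """Plain gradient-descent logistic regression (no regularization)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


def fgsm(X, y, w, b, eps):
    """FGSM-style perturbation: x + eps * sign(grad_x cross-entropy loss)."""
    p = sigmoid(X @ w + b)
    grad = (p - y)[:, None] * w[None, :]   # d loss / d x for each example
    return X + eps * np.sign(grad)


def scores(X, y, w, b):
    """Nonconformity score s(x, y) = 1 - P(y | x)."""
    p1 = sigmoid(X @ w + b)
    return 1.0 - np.where(y == 1, p1, 1.0 - p1)


# Synthetic two-class data and a train / calibration / test split.
n = 3000
X = rng.normal(size=(n, 2))
y = (X[:, 0] + X[:, 1] + 0.5 * rng.normal(size=n) > 0).astype(float)
X_tr, X_cal, X_te = X[:1000], X[1000:2000], X[2000:]
y_tr, y_cal, y_te = y[:1000], y[1000:2000], y[2000:]

w, b = fit_logistic(X_tr, y_tr)
alpha = 0.1                      # target miscoverage level
eps_cal, eps_test = 0.2, 0.2     # calibration- and test-time attack strengths

# Calibrate on adversarially perturbed calibration points.
s_cal = scores(fgsm(X_cal, y_cal, w, b, eps_cal), y_cal, w, b)
k = int(np.ceil((len(s_cal) + 1) * (1 - alpha)))
qhat = np.sort(s_cal)[min(k, len(s_cal)) - 1]   # conformal quantile (clamped at max score)

# Evaluate coverage and set size on adversarially perturbed test points.
X_adv = fgsm(X_te, y_te, w, b, eps_test)
sets = np.stack([scores(X_adv, np.full(len(y_te), c), w, b) <= qhat
                 for c in (0.0, 1.0)], axis=1)
covered = sets[np.arange(len(y_te)), y_te.astype(int)]
print(f"empirical coverage: {covered.mean():.3f}, "
      f"avg set size: {sets.sum(axis=1).mean():.2f}")
```

Sweeping eps_cal for a fixed eps_test (or vice versa) in this sketch illustrates the qualitative behavior in point (i): a stronger calibration-time attack inflates the calibration scores, hence the quantile qhat, and hence the coverage observed under a fixed test-time attack.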
Similar Papers
Split Conformal Classification with Unsupervised Calibration
Machine Learning (Stat)
Calibrates AI predictions without needing labeled data.
Reliable Statistical Guarantees for Conformal Predictors with Small Datasets
Machine Learning (CS)
Makes AI predictions more trustworthy, even with little data.
Domain-Shift-Aware Conformal Prediction for Large Language Models
Machine Learning (Stat)
Makes AI answers more trustworthy and less wrong.