Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction
By: Yuanchang Ye, Weiyan Wen
Potential Business Impact:
Flags AI answers that may be made up, instead of letting them pass as certain.
This study addresses the critical challenge of hallucination mitigation in Large Vision-Language Models (LVLMs) for Visual Question Answering (VQA) tasks through a Split Conformal Prediction (SCP) framework. While LVLMs excel in multi-modal reasoning, their outputs often exhibit hallucinated content with high confidence, posing risks in safety-critical applications. We propose a model-agnostic uncertainty quantification method that integrates dynamic threshold calibration and cross-modal consistency verification. By partitioning data into calibration and test sets, the framework computes nonconformity scores to construct prediction sets with statistical guarantees under user-defined risk levels ($\alpha$). Key innovations include: (1) rigorous control of marginal coverage to ensure empirical error rates remain at or below $\alpha$; (2) dynamic adjustment of prediction set sizes, which shrink as $\alpha$ increases, filtering out low-confidence outputs; (3) elimination of prior distribution assumptions and retraining requirements. Evaluations on benchmarks (ScienceQA, MMMU) with eight LVLMs demonstrate that SCP enforces its theoretical guarantees across all $\alpha$ values. The framework achieves stable performance across varying calibration-to-test split ratios, underscoring its robustness for real-world deployment in healthcare, autonomous systems, and other safety-sensitive domains. This work bridges the gap between theoretical reliability and practical applicability in multi-modal AI systems, offering a scalable solution for hallucination detection and uncertainty-aware decision-making.
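The split conformal procedure described in the abstract can be sketched in a few lines. The sketch below assumes a multiple-choice VQA setting where the LVLM exposes per-option softmax scores; the function and variable names (calibrate_threshold, prediction_set, cal_scores) are illustrative placeholders, not identifiers from the paper's code, and the nonconformity score (one minus the softmax mass on the true answer) is one common choice rather than necessarily the authors' exact definition.

```python
import numpy as np

def calibrate_threshold(cal_scores, cal_labels, alpha):
    """Compute the conformal threshold from a held-out calibration split.

    cal_scores: (n, k) array of softmax scores over the k answer options.
    cal_labels: (n,) array of indices of the true answers.
    alpha:      user-defined risk level (target error rate).
    """
    n = len(cal_labels)
    # Nonconformity score: one minus the softmax mass on the true answer.
    nonconformity = 1.0 - cal_scores[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile level gives the marginal coverage guarantee.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(nonconformity, min(q_level, 1.0), method="higher")

def prediction_set(test_scores, threshold):
    """Return indices of all options whose nonconformity is within the threshold."""
    return np.flatnonzero((1.0 - test_scores) <= threshold)

# Toy usage: 4-option VQA, 500 calibration examples, alpha = 0.1.
rng = np.random.default_rng(0)
cal_scores = rng.dirichlet(np.ones(4), size=500)
cal_labels = rng.integers(0, 4, size=500)
tau = calibrate_threshold(cal_scores, cal_labels, alpha=0.1)
print(prediction_set(rng.dirichlet(np.ones(4)), tau))
```

Because the threshold is a quantile of scores computed only on the calibration split, no retraining or distributional assumption is involved; lowering $\alpha$ raises the threshold and therefore enlarges the prediction sets, which is the size-versus-risk trade-off the abstract refers to.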
Similar Papers
Full Conformal Adaptation of Medical Vision-Language Models
CV and Pattern Recognition
Makes AI doctors more sure about diagnoses.
Object-Level Verbalized Confidence Calibration in Vision-Language Models via Semantic Perturbation
CV and Pattern Recognition
Makes AI tell you when it's unsure.
Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework
Computation and Language
Makes AI answers about health more trustworthy.