Uncertainty Quantification of Large Language Models using Approximate Bayesian Computation
By: Mridul Sharma, Adeetya Patel, Zaneta D'Souza, and more
Potential Business Impact:
Helps AI know when it's unsure about answers.
Despite their widespread applications, Large Language Models (LLMs) often struggle to express uncertainty, posing a challenge for reliable deployment in high-stakes and safety-critical domains like clinical diagnostics. Existing standard baseline methods, such as model logits and elicited probabilities, produce overconfident and poorly calibrated estimates. In this work, we propose an approach based on Approximate Bayesian Computation (ABC), a likelihood-free Bayesian inference method, that treats the LLM as a stochastic simulator to infer posterior distributions over predictive probabilities. We evaluate our ABC approach on two clinically relevant benchmarks: a synthetic oral lesion diagnosis dataset and the publicly available GretelAI symptom-to-diagnosis dataset. Compared to standard baselines, our approach improves accuracy by up to 46.9%, reduces Brier scores by 74.4%, and enhances calibration as measured by Expected Calibration Error (ECE) and predictive entropy.
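The abstract describes the core mechanism (an LLM used as a stochastic simulator inside a likelihood-free inference loop) without implementation details, so the sketch below is a minimal rejection-ABC illustration of that general idea, not the paper's method. The stand-in simulator simulate_llm_votes, the Beta(1,1) prior, the agreement-rate summary statistic, and the tolerance eps are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_llm_votes(p, k, rng):
    """Stand-in stochastic simulator: k sampled LLM answers, each
    agreeing with a candidate diagnosis with probability p.
    (A real run would sample the LLM itself at nonzero temperature.)"""
    return rng.binomial(1, p, size=k)

def abc_posterior(observed_votes, n_draws=20000, eps=0.05, rng=rng):
    """Rejection ABC: draw p from a Beta(1,1) prior, simulate votes,
    and accept p when the simulated agreement rate falls within eps
    of the observed one. Returns accepted posterior samples of p."""
    k = len(observed_votes)
    s_obs = observed_votes.mean()
    accepted = []
    for _ in range(n_draws):
        p = rng.beta(1.0, 1.0)                   # prior over predictive prob.
        s_sim = simulate_llm_votes(p, k, rng).mean()
        if abs(s_sim - s_obs) <= eps:            # distance on summary statistic
            accepted.append(p)
    return np.array(accepted)

# Example: 20 sampled LLM answers, 15 of which give the candidate diagnosis.
votes = np.array([1] * 15 + [0] * 5)
post = abc_posterior(votes)
p_hat = post.mean()
entropy = -(p_hat * np.log(p_hat) + (1 - p_hat) * np.log(1 - p_hat))
print(f"posterior mean prob: {p_hat:.3f}, predictive entropy: {entropy:.3f}")
```

Under this framing, the spread of the accepted samples, not just their mean, carries the uncertainty signal: a wide posterior over p flags a prediction the model should not be confident about.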
Similar Papers
Robustifying Approximate Bayesian Computation
Methodology
Makes computer guesses better when the rules are wrong.
Confidence in Large Language Model Evaluation: A Bayesian Approach to Limited-Sample Challenges
Computation and Language
Tests AI better, even with less data.
The challenge of uncertainty quantification of large language models in medicine
Artificial Intelligence
Helps doctors know when AI is unsure about health advice.