Can Linear Probes Measure LLM Uncertainty?
By: Ramzi Dakhmouche, Adrien Letellier, Hossein Gorji
Potential Business Impact:
Helps AI know how confident to be in its answers.
Effective Uncertainty Quantification (UQ) is a key requirement for the reliable deployment of Large Language Models (LLMs) in automated decision-making and beyond. Yet, for LLM generation with a multiple-choice structure, the state of the art in UQ is still dominated by the naive baseline of the maximum softmax score. To address this shortcoming, we show that a principled Bayesian approach yields improved performance even with the simplest possible model, namely linear regression. More precisely, we propose to train multiple Bayesian linear models, each predicting the output of a layer given the output of the previous one. From the resulting layer-level posterior distributions, we infer the global uncertainty level of the LLM by identifying a sparse combination of distributional features, yielding an efficient UQ scheme. Numerical experiments on various LLMs show consistent improvement over state-of-the-art baselines.
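To make the pipeline concrete, here is a minimal sketch of the layer-wise Bayesian-linear-probe idea, not the paper's actual implementation. It assumes (these details are not given in the abstract) that hidden states are pooled to one vector per layer, that the per-layer "distributional feature" is the mean posterior predictive standard deviation of a Bayesian linear model mapping layer l to layer l+1, and that the sparse combination is an L1-penalised logistic probe predicting answer correctness.

```python
# Sketch only: toy data stands in for pooled LLM hidden states and correctness labels.
import numpy as np
from sklearn.linear_model import BayesianRidge, LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in for pooled hidden states: n_examples x n_layers x hidden_dim
n_examples, n_layers, hidden_dim = 200, 6, 32
hidden_states = rng.normal(size=(n_examples, n_layers, hidden_dim))
# Toy stand-in for whether the LLM's multiple-choice answer was correct (1 = correct)
correct = rng.integers(0, 2, size=n_examples)


def layer_uncertainty_features(hidden_states):
    """For each pair of consecutive layers, fit Bayesian linear models predicting
    layer l+1 activations from layer l activations, and return the mean posterior
    predictive std per example as that layer's uncertainty feature."""
    n_examples, n_layers, hidden_dim = hidden_states.shape
    features = np.zeros((n_examples, n_layers - 1))
    for l in range(n_layers - 1):
        X = hidden_states[:, l, :]
        Y = hidden_states[:, l + 1, :]
        stds = np.zeros((n_examples, hidden_dim))
        for d in range(hidden_dim):
            model = BayesianRidge()
            model.fit(X, Y[:, d])
            _, std = model.predict(X, return_std=True)  # posterior predictive std
            stds[:, d] = std
        features[:, l] = stds.mean(axis=1)
    return features


# Layer-level posterior features -> sparse (L1) combination into a single UQ score
feats = layer_uncertainty_features(hidden_states)
probe = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
probe.fit(feats, correct)
uq_scores = probe.predict_proba(feats)[:, 1]  # higher = more confidence in correctness
print("layers with non-zero weight:", np.flatnonzero(probe.coef_))
```

The L1 penalty is what produces the sparse combination of layer-level features mentioned in the abstract: only a few layers end up with non-zero weight, which keeps the resulting UQ scheme cheap at inference time.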
Similar Papers
Uncertainty Quantification and Confidence Calibration in Large Language Models: A Survey
Computation and Language
Helps AI know when it's wrong.
Mind the Gap: Benchmarking LLM Uncertainty, Discrimination, and Calibration in Specialty-Aware Clinical QA
Computation and Language
Helps doctors trust AI answers in medicine.
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Computation and Language
Makes AI tell the truth, not make things up.