Score: 3

Can Linear Probes Measure LLM Uncertainty?

Published: October 5, 2025 | arXiv ID: 2510.04108v1

By: Ramzi Dakhmouche, Adrien Letellier, Hossein Gorji

Potential Business Impact:

Helps gauge how trustworthy an AI's answers are, so unreliable outputs can be flagged before they drive automated decisions.

Business Areas:
Quantum Computing Science and Engineering

Effective Uncertainty Quantification (UQ) is a key requirement for the reliable deployment of Large Language Models (LLMs) in automated decision-making and beyond. Yet, for LLM generation with a multiple-choice structure, the state of the art in UQ is still dominated by the naive baseline of the maximum softmax score. To address this shortcoming, we demonstrate that a principled Bayesian approach leads to improved performance despite leveraging the simplest possible model, namely linear regression. More precisely, we propose to train multiple Bayesian linear models, each predicting the output of a layer given the output of the previous one. Based on the obtained layer-level posterior distributions, we infer the global uncertainty level of the LLM by identifying a sparse combination of distributional features, leading to an efficient UQ scheme. Numerical experiments on various LLMs show consistent improvement over state-of-the-art baselines.
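The pipeline the abstract describes can be sketched in three steps: fit a Bayesian linear model mapping each layer's activations to the next layer's, extract a per-example distributional feature (here, the posterior predictive variance) from each layer-level posterior, and learn a sparse combination of those features as the uncertainty score. The sketch below is a minimal illustration under stated assumptions, not the authors' exact scheme: the conjugate Gaussian posterior, the choice of predictive variance as the feature, the Lasso as the sparsifier, and all variable names are illustrative, and the activations and labels are synthetic toy data.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

def bayesian_linear_posterior(X, Y, alpha=1.0, beta=1.0):
    """Conjugate Gaussian posterior for W in Y = X @ W + noise.

    alpha: prior precision on the weights; beta: noise precision.
    Returns the posterior mean M and (input-side) covariance S.
    """
    d = X.shape[1]
    S = np.linalg.inv(alpha * np.eye(d) + beta * X.T @ X)
    M = beta * S @ X.T @ Y
    return M, S

def predictive_variance(X, S, beta=1.0):
    """Per-example posterior predictive variance x^T S x + 1/beta."""
    return np.einsum("ni,ij,nj->n", X, S, X) + 1.0 / beta

# Toy stand-in for LLM hidden states: 3 "layers" for 200 prompts, dim 16.
n, d, n_layers = 200, 16, 3
layers = [rng.standard_normal((n, d))]
for _ in range(n_layers - 1):
    W_true = rng.standard_normal((d, d)) / np.sqrt(d)
    layers.append(layers[-1] @ W_true + 0.1 * rng.standard_normal((n, d)))

# One Bayesian linear model per consecutive layer pair; each contributes
# one distributional feature per example.
feature_cols = []
for X, Y in zip(layers[:-1], layers[1:]):
    _, S = bayesian_linear_posterior(X, Y)
    feature_cols.append(predictive_variance(X, S))
F = np.column_stack(feature_cols)  # shape: (n, n_layers - 1)

# Sparse combination of layer features -> global uncertainty score.
# Targets here are synthetic "error indicators" correlated with F.
y = F @ np.array([1.0, 0.0]) + 0.05 * rng.standard_normal(n)
combiner = Lasso(alpha=0.01).fit(F, y)
uncertainty_scores = combiner.predict(F)
```

The L1 penalty in the Lasso step plays the role of the "sparse combination of distributional features": it drives the weights of uninformative layers toward zero, so the final score depends on only a few layer-level posteriors.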

Country of Origin
🇨🇭 🇫🇷 Switzerland, France

Repos / Data Links

Page Count
15 pages

Category
Computer Science:
Machine Learning (CS)