Can Linear Probes Measure LLM Uncertainty?
By: Ramzi Dakhmouche, Adrien Letellier, Hossein Gorji
Potential Business Impact:
Helps AI know how confident to be in its answers.
Effective Uncertainty Quantification (UQ) is a key requirement for the reliable deployment of Large Language Models (LLMs) in automated decision-making and beyond. Yet, for LLM generation with a multiple-choice structure, the state of the art in UQ is still dominated by the naive baseline of the maximum softmax score. To address this shortcoming, we show that a principled Bayesian approach yields improved performance even with the simplest possible model, namely linear regression. More precisely, we propose to train multiple Bayesian linear models, each predicting the output of a layer given the output of the previous one. From the resulting layer-level posterior distributions, we infer the global uncertainty level of the LLM by identifying a sparse combination of distributional features, yielding an efficient UQ scheme. Numerical experiments on various LLMs show consistent improvement over state-of-the-art baselines.
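To make the pipeline concrete, here is a minimal sketch of the layer-wise Bayesian-linear-probe idea, not the paper's actual implementation. It assumes (these details are not given in the abstract) that hidden states are pooled to one vector per layer, that the per-layer "distributional feature" is the mean posterior predictive standard deviation of a Bayesian linear model mapping layer l to layer l+1, and that the sparse combination is an L1-penalised logistic probe predicting answer correctness.

```python
# Sketch only: toy data stands in for pooled LLM hidden states and correctness labels.
import numpy as np
from sklearn.linear_model import BayesianRidge, LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in for pooled hidden states: n_examples x n_layers x hidden_dim
n_examples, n_layers, hidden_dim = 200, 6, 32
hidden_states = rng.normal(size=(n_examples, n_layers, hidden_dim))
# Toy stand-in for whether the LLM's multiple-choice answer was correct (1 = correct)
correct = rng.integers(0, 2, size=n_examples)


def layer_uncertainty_features(hidden_states):
    """For each pair of consecutive layers, fit Bayesian linear models predicting
    layer l+1 activations from layer l activations, and return the mean posterior
    predictive std per example as that layer's uncertainty feature."""
    n_examples, n_layers, hidden_dim = hidden_states.shape
    features = np.zeros((n_examples, n_layers - 1))
    for l in range(n_layers - 1):
        X = hidden_states[:, l, :]
        Y = hidden_states[:, l + 1, :]
        stds = np.zeros((n_examples, hidden_dim))
        for d in range(hidden_dim):
            model = BayesianRidge()
            model.fit(X, Y[:, d])
            _, std = model.predict(X, return_std=True)  # posterior predictive std
            stds[:, d] = std
        features[:, l] = stds.mean(axis=1)
    return features


# Layer-level posterior features -> sparse (L1) combination into a single UQ score
feats = layer_uncertainty_features(hidden_states)
probe = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
probe.fit(feats, correct)
uq_scores = probe.predict_proba(feats)[:, 1]  # higher = more confidence in correctness
print("layers with non-zero weight:", np.flatnonzero(probe.coef_))
```

The L1 penalty is what produces the sparse combination of layer-level features mentioned in the abstract: only a few layers end up with non-zero weight, which keeps the resulting UQ scheme cheap at inference time.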
Similar Papers
Uncertainty Quantification and Confidence Calibration in Large Language Models: A Survey
Computation and Language
Helps AI know when it's wrong.
Mind the Gap: Benchmarking LLM Uncertainty, Discrimination, and Calibration in Specialty-Aware Clinical QA
Computation and Language
Helps doctors trust AI answers in medicine.
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Computation and Language
Makes AI tell the truth, not make things up.