Distance Is All You Need: Radial Dispersion for Uncertainty Estimation in Large Language Models
By: Manh Nguyen, Sunil Gupta, Hung Le
Potential Business Impact:
Tells if AI is unsure about its answers.
Detecting when large language models (LLMs) are uncertain is critical for building reliable systems, yet existing methods are overly complicated, relying on brittle semantic clustering or access to internal model states. We introduce \textbf{Radial Dispersion Score (RDS)}, a simple, parameter-free, fully model-agnostic uncertainty metric that measures the radial dispersion of sampled generations in embedding space. A lightweight probability-weighted variant further incorporates the model's own token probabilities when available, outperforming nine strong baselines. Moreover, RDS naturally extends to per-sample scoring, enabling applications such as best-of-$N$ selection and confidence-based filtering. Across four challenging free-form QA datasets and multiple LLMs, our metrics achieve state-of-the-art hallucination detection and answer selection performance, while remaining robust and scalable with respect to sample size and embedding choice.
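The abstract does not spell out the exact formula, but the core idea of a radial dispersion score can be sketched as follows: embed the $N$ sampled generations, take their centroid, and average each sample's distance from it (optionally weighting samples by the model's sequence probabilities). The snippet below is a minimal illustration under that assumed reading; the function names (`radial_dispersion_score`, `per_sample_scores`) and the specific weighting scheme are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of a radial-dispersion uncertainty score (assumed reading of RDS,
# not the paper's exact formula): embed N sampled generations, then measure how
# far the embeddings spread around their centroid. Higher dispersion suggests
# higher uncertainty. The probability-weighted variant reweights each sample by
# its (normalized) sequence probability.

import numpy as np


def radial_dispersion_score(embeddings: np.ndarray, weights: np.ndarray | None = None) -> float:
    """Mean (optionally probability-weighted) distance of sample embeddings from their centroid.

    embeddings: (N, d) array, one row per sampled generation.
    weights:    (N,) nonnegative sequence probabilities; uniform if None.
    """
    if weights is None:
        weights = np.ones(len(embeddings))
    weights = weights / weights.sum()                      # normalize to a distribution
    centroid = (weights[:, None] * embeddings).sum(axis=0)
    radii = np.linalg.norm(embeddings - centroid, axis=1)  # per-sample radial distance
    return float((weights * radii).sum())


def per_sample_scores(embeddings: np.ndarray) -> np.ndarray:
    """Per-sample distance to the centroid; smaller means closer to consensus.

    A plausible basis for best-of-N selection (pick the sample with the smallest
    radius) or confidence-based filtering, as the abstract describes.
    """
    centroid = embeddings.mean(axis=0)
    return np.linalg.norm(embeddings - centroid, axis=1)


if __name__ == "__main__":
    # Toy example: 5 sampled answers embedded in 3 dimensions (random stand-ins
    # for sentence-embedder outputs).
    rng = np.random.default_rng(0)
    embs = rng.normal(size=(5, 3))
    seq_probs = np.array([0.4, 0.3, 0.15, 0.1, 0.05])  # hypothetical sequence probabilities

    print("RDS (uniform):", radial_dispersion_score(embs))
    print("RDS (prob-weighted):", radial_dispersion_score(embs, seq_probs))
    print("Best-of-N pick:", int(per_sample_scores(embs).argmin()))
```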
Similar Papers
Measuring Aleatoric and Epistemic Uncertainty in LLMs: Empirical Evaluation on ID and OOD QA Tasks
Computation and Language
Helps computers know when they are unsure.
Confidence and Dispersity as Signals: Unsupervised Model Evaluation and Ranking
Machine Learning (CS)
Helps computers check their own work without answers.
D$^2$HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis in LLMs
Computation and Language
Stops AI from making up wrong information.