Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Published: January 9, 2026 | arXiv ID: 2601.05905v1

By: Haoming Xu, Ningyuan Zhao, Yunzhi Yao, and more

Potential Business Impact:

Helps detect when an AI model's factual answers are brittle, so it can be trained to stay truthful under distracting context.

Business Areas:
Semantic Search, Internet Services

As Large Language Models (LLMs) are increasingly deployed in real-world settings, correctness alone is insufficient: reliable deployment requires maintaining truthful beliefs under contextual perturbations. Existing evaluations largely rely on point-wise confidence measures such as Self-Consistency, which can mask brittle beliefs. We show that even facts answered with perfect self-consistency can collapse rapidly under mild contextual interference. To address this gap, we propose Neighbor-Consistency Belief (NCB), a structural measure of belief robustness that evaluates response coherence across a conceptual neighborhood. To validate the effectiveness of NCB, we introduce a cognitive stress-testing protocol that probes output stability under contextual interference. Experiments across multiple LLMs show that facts with high NCB scores are markedly more resistant to interference than those with low scores. Finally, we present Structure-Aware Training (SAT), which optimizes for a context-invariant belief structure and reduces long-tail knowledge brittleness by approximately 30%. Code will be available at https://github.com/zjunlp/belief.
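
The abstract does not give NCB's formal definition. Below is a minimal sketch of how a neighborhood-consistency score of this kind might be computed: sample answers to a target question and to paraphrased "neighbor" questions, then measure agreement with the target's majority answer. The function name, the agreement rule, and the toy model call are assumptions for illustration, not the paper's actual method.

```python
from collections import Counter
from typing import Callable, List


def neighbor_consistency_belief(
    answer_fn: Callable[[str], str],
    target_prompt: str,
    neighbor_prompts: List[str],
    samples_per_prompt: int = 5,
) -> float:
    """Hypothetical NCB-style score: the fraction of sampled answers,
    across the target prompt and its conceptual neighbors, that agree
    with the target prompt's majority answer. Illustrative only; the
    paper's exact formulation may differ."""
    # Majority answer on the target prompt itself.
    target_answers = [answer_fn(target_prompt) for _ in range(samples_per_prompt)]
    majority, _ = Counter(target_answers).most_common(1)[0]

    # Count how often the neighboring prompts yield that same answer.
    all_answers = list(target_answers)
    for prompt in neighbor_prompts:
        all_answers += [answer_fn(prompt) for _ in range(samples_per_prompt)]
    return sum(a == majority for a in all_answers) / len(all_answers)


if __name__ == "__main__":
    # Toy stand-in for an LLM call (always returns the same answer).
    toy_model = lambda prompt: "Paris"
    score = neighbor_consistency_belief(
        toy_model,
        "What is the capital of France?",
        [
            "Which city is France's capital?",
            "France's seat of government is located in which city?",
        ],
    )
    print(f"NCB = {score:.2f}")  # 1.00 for this perfectly consistent toy model
```

A score near 1 would indicate the model answers coherently across the whole conceptual neighborhood, while a lower score would flag the kind of brittle belief that the paper's stress-testing protocol is designed to expose.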

Country of Origin
🇨🇳 China

Page Count
26 pages

Category
Computer Science:
Computation and Language