Read Your Own Mind: Reasoning Helps Surface Self-Confidence Signals in LLMs
By: Jakub Podolak, Rajeev Verma
Potential Business Impact:
Makes AI more honest about what it knows.
We study the source of uncertainty in DeepSeek R1-32B by analyzing its self-reported verbal confidence on question answering (QA) tasks. In the default answer-then-confidence setting, the model is regularly over-confident, whereas semantic entropy, obtained by sampling many responses, remains reliable. We hypothesize that this is because semantic entropy uses more test-time compute, which lets it explore the model's predictive distribution. We show that granting DeepSeek the budget to explore its distribution, by forcing a long chain-of-thought before the final answer, greatly improves the reliability of its verbal confidence scores, even on simple fact-retrieval questions that normally require no reasoning. Furthermore, a separate reader model that sees only the chain of thought can reconstruct very similar confidences, indicating that the verbal score may be merely a statistic of the alternatives surfaced during reasoning. We conclude that reliable uncertainty estimation requires explicit exploration of the generative space, and that self-reported confidence is trustworthy only after such exploration.
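For intuition, here is a minimal sketch of the semantic-entropy baseline the abstract refers to: sample several answers to the same question, group them by meaning, and compute the entropy of the resulting cluster distribution. The function name `semantic_entropy` and the normalized string-match clustering are illustrative assumptions; in practice semantic equivalence is usually judged with an entailment (NLI) model rather than exact string equality.

```python
import math
from collections import Counter

def semantic_entropy(answers):
    """Entropy over clusters of semantically equivalent sampled answers."""
    # Group sampled answers into meaning clusters; a crude normalized
    # exact match stands in here for an NLI-based equivalence check.
    clusters = Counter(a.strip().lower() for a in answers)
    n = sum(clusters.values())
    # Shannon entropy of the empirical cluster distribution (in nats).
    return -sum((c / n) * math.log(c / n) for c in clusters.values())

# Ten sampled answers to one question: most agree, so entropy is low.
samples = ["Canberra"] * 6 + ["Sydney"] * 3 + ["Melbourne"]
print(f"semantic entropy: {semantic_entropy(samples):.3f} nats")
```

A concentrated cluster distribution (most samples agreeing) yields low entropy and signals high confidence; a spread-out distribution yields high entropy, which is the exploration signal that, per the paper, verbal confidence only captures after a long chain-of-thought.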
Similar Papers
Don't Miss the Forest for the Trees: In-Depth Confidence Estimation for LLMs via Reasoning over the Answer Space
Computation and Language
Helps AI know how sure it is about answers.
Measuring Reasoning Utility in LLMs via Conditional Entropy Reduction
Computation and Language
Helps computers know when their thinking is wrong.
Open the Oyster: Empirical Evaluation and Improvement of Code Reasoning Confidence in LLMs
Software Engineering
Makes AI better at knowing when it's right.