Do not Abstain! Identify and Solve the Uncertainty
By: Jingyu Liu, Jingquan Peng, Xiaopeng Wu, and more
Potential Business Impact:
Helps AI identify why it is uncertain and resolve it instead of simply abstaining.
Despite the widespread application of Large Language Models (LLMs) across various domains, they frequently exhibit overconfidence when encountering uncertain scenarios. Existing solutions rely primarily on evasive responses (e.g., "I don't know") and overlook the opportunity to identify and address the uncertainty in order to generate more satisfactory responses. To systematically investigate and improve LLMs' ability to recognize and address the source of uncertainty, we introduce ConfuseBench, a benchmark focusing on three types of uncertainty: document scarcity, limited capability, and query ambiguity. Experiments with ConfuseBench reveal that current LLMs struggle to accurately identify the root cause of uncertainty and resolve it. They tend to attribute uncertainty to query ambiguity while overlooking capability limitations, especially in weaker models. To tackle this challenge, we first generate context-aware inquiries that highlight the confusing aspect of the original query. We then judge the source of uncertainty based on the uniqueness of the inquiry's answer. Finally, we use an on-policy training method, InteractDPO, to generate better inquiries. Experimental results demonstrate the efficacy of our approach.
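The attribution step described in the abstract (judging the source of uncertainty from how unique the inquiry's answer is) can be illustrated with a minimal sketch. This is not the authors' implementation; the `llm` callable, the prompts, and the decision rules below are illustrative assumptions only.

```python
# Minimal sketch (not the paper's code) of inquiry-based uncertainty attribution:
# generate a clarifying inquiry for a confusing query, sample several answers to
# that inquiry, and attribute the uncertainty source from answer uniqueness.
# `llm`, the prompts, and the decision thresholds are illustrative assumptions.
from collections import Counter
from typing import Callable, List


def generate_inquiry(llm: Callable[[str], str], query: str, docs: List[str]) -> str:
    """Ask the model to surface the confusing aspect of the query in context."""
    context = "\n".join(docs)
    prompt = (
        f"Context:\n{context}\n\nQuery: {query}\n"
        "State, as a single question, what information is missing or ambiguous."
    )
    return llm(prompt)


def classify_uncertainty(llm: Callable[[str], str], inquiry: str,
                         docs: List[str], n_samples: int = 5) -> str:
    """Heuristic attribution based on the uniqueness of the inquiry's answer."""
    context = "\n".join(docs)
    answers = [
        llm(f"Context:\n{context}\n\nAnswer briefly, or say 'unknown': {inquiry}")
        for _ in range(n_samples)
    ]
    normalized = [a.strip().lower() for a in answers]
    top_count = Counter(normalized).most_common(1)[0][1]

    if all(a == "unknown" for a in normalized):
        return "document scarcity"   # the context never supports an answer
    if top_count == n_samples:
        return "limited capability"  # answer is unique; the model failed to use it
    return "query ambiguity"         # answers diverge; several readings are plausible
```

In this sketch, a unanimous answer suggests the information was recoverable and the original failure was the model's own, while divergent answers suggest the query itself admits multiple readings; these cutoffs are a simplification of the uniqueness criterion described in the abstract.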
Similar Papers
An Empirical Analysis of Uncertainty in Large Language Model Evaluations
Computation and Language
Makes AI better at judging other AI.
Pretrained LLMs Learn Multiple Types of Uncertainty
Computation and Language
Makes AI know when it's unsure.
The Illusion of Certainty: Uncertainty quantification for LLMs fails under ambiguity
Machine Learning (CS)
Makes AI understand when it's unsure.