ConfQA: Answer Only If You Are Confident
By: Yin Huang, Yifan Ethan Xu, Kai Sun, and more
Potential Business Impact:
Teaches AI to say "I don't know."
Can we teach Large Language Models (LLMs) to refrain from hallucinating factual statements? In this paper we present a fine-tuning strategy that we call ConfQA, which can reduce the hallucination rate from 20-40% to under 5% across multiple factuality benchmarks. The core idea is simple: when the LLM answers a question correctly, it is trained to continue with the answer; otherwise, it is trained to admit "I am unsure". But two key factors make the training highly effective. First, we introduce a dampening prompt, "answer only if you are confident", to explicitly guide the behavior; without it, hallucination remains as high as 15%-25%. Second, we leverage simple factual statements, specifically attribute values from knowledge graphs, to help LLMs calibrate their confidence, resulting in robust generalization across domains and question types. Building on this insight, we propose the Dual Neural Knowledge framework, which seamlessly selects between internally parameterized neural knowledge and externally recorded symbolic knowledge based on ConfQA's confidence. The framework enables potential accuracy gains beyond 95%, while reducing unnecessary external retrievals by over 30%.
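The sketch below illustrates, under stated assumptions, how ConfQA-style fine-tuning data and the confidence-based routing of the Dual Neural Knowledge framework might look in practice. It is not the authors' released code: the dampening prompt and the "I am unsure" target follow the abstract, while helper names such as `build_confqa_example`, `confqa_model`, and `retriever` are hypothetical placeholders.

```python
# Minimal sketch of ConfQA-style training-pair construction and
# confidence-based routing, assuming a supervised fine-tuning setup.

DAMPENER = "Answer only if you are confident."
UNSURE = "I am unsure about the answer."


def build_confqa_example(question: str, model_answer: str, is_correct: bool) -> dict:
    """Build one supervised fine-tuning pair.

    If the model already answers the simple factual question (e.g. a
    knowledge-graph attribute value) correctly, train it to keep giving
    that answer; otherwise train it to abstain with "I am unsure".
    """
    prompt = f"{DAMPENER}\nQuestion: {question}"
    target = model_answer if is_correct else UNSURE
    return {"prompt": prompt, "target": target}


def answer_with_dual_knowledge(question: str, confqa_model, retriever) -> str:
    """Sketch of the Dual Neural Knowledge idea: use the parametric answer
    when the ConfQA-tuned model does not abstain, and fall back to an
    external retriever (symbolic knowledge) when it says it is unsure.
    `confqa_model` and `retriever` are assumed callables, not a real API.
    """
    internal = confqa_model(f"{DAMPENER}\nQuestion: {question}")
    if UNSURE.lower() not in internal.lower():
        return internal          # confident: answer from internal neural knowledge
    return retriever(question)   # unsure: consult external symbolic knowledge


# Example usage with a hypothetical incorrect model answer, which yields
# an abstention target rather than the wrong answer:
example = build_confqa_example(
    question="What year was the Eiffel Tower completed?",
    model_answer="1887",
    is_correct=False,
)
print(example["target"])  # -> "I am unsure about the answer."
```

The routing step is what drives the retrieval savings described above: external lookups happen only when the model abstains, rather than on every question.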
Similar Papers
Multi-Modal Fact-Verification Framework for Reducing Hallucinations in Large Language Models
Artificial Intelligence
Fixes AI lies to make it more truthful.
Uncertainty-Aware Fusion: An Ensemble Framework for Mitigating Hallucinations in Large Language Models
Machine Learning (CS)
Makes AI answer questions truthfully, not make things up.
Thinking, Faithful and Stable: Mitigating Hallucinations in LLMs
Artificial Intelligence
Makes AI think more carefully and be more truthful.