FVA-RAG: Falsification-Verification Alignment for Mitigating Sycophantic Hallucinations
By: Mayank Ravishankara
Potential Business Impact:
Keeps AI from agreeing with false claims built into users' questions.
Retrieval-Augmented Generation (RAG) systems have significantly reduced hallucinations in Large Language Models (LLMs) by grounding responses in external context. However, standard RAG architectures suffer from a critical vulnerability: Retrieval Sycophancy. When presented with a query based on a false premise or a common misconception, vector-based retrievers tend to fetch documents that align with the user's bias rather than objective truth, leading the model to "hallucinate with citations." In this work, we introduce Falsification-Verification Alignment RAG (FVA-RAG), a framework that shifts the retrieval paradigm from Inductive Verification (seeking support) to Deductive Falsification (seeking disproof). Unlike existing "Self-Correction" methods that rely on internal consistency, FVA-RAG deploys a distinct Adversarial Retrieval Policy that actively generates "Kill Queries": targeted search terms designed to surface contradictory evidence. We introduce a dual-verification mechanism that explicitly weighs the draft answer against this "Anti-Context." Preliminary experiments on a dataset of common misconceptions demonstrate that FVA-RAG significantly improves robustness against sycophantic hallucinations compared to standard RAG baselines, effectively acting as an inference-time "Red Team" for factual generation.
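To make the described pipeline concrete, below is a minimal Python sketch of the inference loop the abstract outlines: draft an answer with standard retrieval, generate "Kill Queries" against the draft, retrieve the resulting "Anti-Context," and run dual verification. This is an illustration under stated assumptions, not the paper's implementation; the `llm()` and `retrieve()` helpers and the `fva_rag` function name are hypothetical stand-ins for a real model client and vector store.

```python
# Minimal sketch of an FVA-RAG-style inference loop (illustrative only).
# llm() and retrieve() are hypothetical stubs standing in for a real LLM
# client and vector-store search; they are not part of the paper.

from dataclasses import dataclass


def llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real model client."""
    return "stubbed model output for: " + prompt[:60]


def retrieve(query: str, k: int = 5) -> list[str]:
    """Hypothetical vector-store search; replace with a real retriever."""
    return [f"document {i} matching '{query}'" for i in range(k)]


@dataclass
class FVAResult:
    draft: str
    kill_queries: list[str]
    anti_context: list[str]
    final: str


def fva_rag(user_query: str, n_kill_queries: int = 3) -> FVAResult:
    # 1. Standard RAG draft: retrieve supporting context and answer.
    support = retrieve(user_query)
    draft = llm(
        f"Context:\n{chr(10).join(support)}\n\nAnswer the question: {user_query}"
    )

    # 2. Adversarial retrieval policy: generate "Kill Queries" aimed at
    #    surfacing evidence that would contradict the draft answer.
    kill_prompt = (
        f"Draft answer: {draft}\n"
        f"Write {n_kill_queries} search queries that would find evidence "
        "disproving this answer, one per line."
    )
    kill_queries = [q.strip() for q in llm(kill_prompt).splitlines() if q.strip()]

    # 3. Retrieve the "Anti-Context" for each kill query.
    anti_context = [doc for q in kill_queries for doc in retrieve(q, k=2)]

    # 4. Dual verification: weigh the draft against both the supporting
    #    context and the anti-context, then revise or retract.
    verify_prompt = (
        f"Question: {user_query}\n"
        f"Draft answer: {draft}\n"
        f"Supporting context:\n{chr(10).join(support)}\n"
        f"Contradictory context:\n{chr(10).join(anti_context)}\n"
        "If the contradictory context falsifies the draft, correct it; "
        "otherwise keep it. Return the final answer."
    )
    final = llm(verify_prompt)
    return FVAResult(draft, kill_queries, anti_context, final)


if __name__ == "__main__":
    print(fva_rag("Do we only use 10% of our brains?").final)
```

The key design choice the sketch highlights is that the kill queries are conditioned on the draft answer rather than the user's query, so the second retrieval pass searches for disproof instead of echoing the user's framing.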
Similar Papers
MedTrust-RAG: Evidence Verification and Trust Alignment for Biomedical Question Answering
Computation and Language
Makes AI answer medical questions truthfully.
FAIR-RAG: Faithful Adaptive Iterative Refinement for Retrieval-Augmented Generation
Computation and Language
Helps AI answer hard questions with more facts.