SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking
By: Dien X. Tran, Nam V. Nguyen, Thanh T. Tran and more
Potential Business Impact:
Fights fake news in Vietnamese, faster and better.
The rise of misinformation, exacerbated by Large Language Models (LLMs) like GPT and Gemini, demands robust fact-checking solutions, especially for low-resource languages like Vietnamese. Existing methods struggle with semantic ambiguity, homonyms, and complex linguistic structures, often trading accuracy for efficiency. We introduce SemViQA, a novel Vietnamese fact-checking framework integrating Semantic-based Evidence Retrieval (SER) and Two-step Verdict Classification (TVC). Our approach balances precision and speed, achieving state-of-the-art results with 78.97% strict accuracy on ISE-DSC01 and 80.82% on ViWikiFC, securing 1st place in the UIT Data Science Challenge. Additionally, SemViQA Faster improves inference speed 7x while maintaining competitive accuracy. SemViQA sets a new benchmark for Vietnamese fact verification, advancing the fight against misinformation. The source code is available at: https://github.com/DAVID-NGUYEN-S16/SemViQA.
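The abstract reports "strict accuracy" without defining it. A minimal sketch of one common reading is given below, under the assumption that a prediction counts as correct only when the verdict matches the gold label and, for claims that have evidence, the retrieved evidence sentence matches as well. The label names (`SUPPORTED`, `REFUTED`, `NEI`), the `strict_accuracy` function, and the whitespace and case normalization are hypothetical choices for illustration and are not taken from the SemViQA code or the benchmark's official scorer.

```python
from typing import Optional, Sequence, Tuple

# (verdict, evidence sentence or None); an assumed representation, not the benchmark's format
Prediction = Tuple[str, Optional[str]]


def _normalize(text: Optional[str]) -> str:
    """Collapse whitespace and lowercase so trivial formatting differences are ignored."""
    return " ".join((text or "").split()).lower()


def strict_accuracy(preds: Sequence[Prediction], golds: Sequence[Prediction]) -> float:
    """Fraction of claims where the verdict AND (when required) the evidence are correct.

    Assumption: claims whose gold verdict is "NEI" (not enough information)
    carry no evidence, so only the verdict is compared for them.
    """
    if len(preds) != len(golds):
        raise ValueError("preds and golds must have the same length")
    if not golds:
        return 0.0

    correct = 0
    for (pred_verdict, pred_evidence), (gold_verdict, gold_evidence) in zip(preds, golds):
        if pred_verdict != gold_verdict:
            continue  # wrong verdict: never counts, whatever the evidence
        if gold_verdict == "NEI" or _normalize(pred_evidence) == _normalize(gold_evidence):
            correct += 1
    return correct / len(golds)
```

Under this reading, a system that predicts the right verdict from the wrong evidence sentence gets no credit, which is why strict accuracy sits at or below plain verdict accuracy.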
Similar Papers
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models
CV and Pattern Recognition
Tests if AI understands video facts correctly.
Bridging the Semantic Gaps: Improving Medical VQA Consistency with LLM-Augmented Question Sets
CV and Pattern Recognition
Helps doctors understand X-rays by asking questions.
VisualSimpleQA: A Benchmark for Decoupled Evaluation of Large Vision-Language Models in Fact-Seeking Question Answering
Computation and Language
Helps computers answer questions about pictures more truthfully.