BERT-based model for Vietnamese Fact Verification Dataset
By: Bao Tran, T. N. Khanh, Khang Nguyen Tuong and more
Potential Business Impact:
Helps check if Vietnamese news is true.
The rapid advancement of information and communication technology has made information easier to access. However, this progress has also made more stringent verification necessary to ensure that information is accurate, particularly in the context of Vietnam. This paper introduces an approach to Fact Verification on a Vietnamese dataset that integrates the sentence selection and classification modules into a unified network architecture. The proposed approach leverages large language models by using pre-trained PhoBERT and XLM-RoBERTa as the backbone of the network. The proposed model was trained on a Vietnamese dataset named ISE-DSC01 and outperformed the baseline model on all three metrics. Notably, we achieved a Strict Accuracy of 75.11%, a 28.83% improvement over the baseline model.
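The abstract reports Strict Accuracy without defining it. A common reading in fact-verification tasks, assumed here rather than taken from the paper, is that a prediction counts as correct only when both the predicted verdict and the selected evidence sentence match the reference. The sketch below illustrates that reading in Python; the field names (`verdict`, `evidence`), the label strings, and the whitespace normalization are all hypothetical choices made for illustration.

```python
from typing import Optional


def normalize(evidence: Optional[str]) -> str:
    """Collapse whitespace so trivially different strings compare equal.

    Assumption: evidence is a single sentence, or None when the claim
    has no supporting sentence (e.g. a 'not enough information' verdict).
    """
    if evidence is None:
        return ""
    return " ".join(evidence.split())


def strict_accuracy(predictions: list, references: list) -> float:
    """Fraction of examples where verdict AND evidence both match.

    Each item is a dict with hypothetical keys 'verdict' and 'evidence'.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    if not references:
        return 0.0
    hits = 0
    for pred, ref in zip(predictions, references):
        verdict_ok = pred["verdict"] == ref["verdict"]
        evidence_ok = normalize(pred.get("evidence")) == normalize(ref.get("evidence"))
        if verdict_ok and evidence_ok:
            hits += 1
    return hits / len(references)


# Toy data, invented for illustration only.
references = [
    {"verdict": "SUPPORTED", "evidence": "Sentence A."},
    {"verdict": "REFUTED", "evidence": "Sentence B."},
    {"verdict": "NEI", "evidence": None},
    {"verdict": "SUPPORTED", "evidence": "Sentence C."},
]
predictions = [
    {"verdict": "SUPPORTED", "evidence": "Sentence  A."},  # match after normalization
    {"verdict": "REFUTED", "evidence": "Sentence X."},     # right verdict, wrong evidence
    {"verdict": "NEI", "evidence": None},                  # match
    {"verdict": "SUPPORTED", "evidence": "Sentence C."},   # match
]

score = strict_accuracy(predictions, references)
print(f"strict accuracy = {score:.2f}")
```

Under this reading, the second toy prediction is counted as wrong even though its verdict is right, which is why a strict score of this kind is generally no higher than verdict-only accuracy.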
Similar Papers
SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking
Computation and Language
Fights fake news in Vietnamese, faster and better.
An Encoder-Integrated PhoBERT with Graph Attention for Vietnamese Token-Level Classification
Computation and Language
Helps computers understand Vietnamese text better.
Zero-Shot Text-to-Speech for Vietnamese
Computation and Language
Makes computers speak Vietnamese like people.