VietMed-MCQ: A Consistency-Filtered Data Synthesis Framework for Vietnamese Traditional Medicine Evaluation
By: Huynh Trung Kiet, Dao Sy Duy Minh, Nguyen Dinh Ha Duong and more
Potential Business Impact:
Helps computers understand old Vietnamese medicine.
Large Language Models (LLMs) have demonstrated remarkable proficiency in general medical domains. However, their performance significantly degrades in specialized, culturally specific domains such as Vietnamese Traditional Medicine (VTM), primarily due to the scarcity of high-quality, structured benchmarks. In this paper, we introduce VietMed-MCQ, a novel multiple-choice question dataset generated via a Retrieval-Augmented Generation (RAG) pipeline with an automated consistency check mechanism. Unlike previous synthetic datasets, our framework incorporates a dual-model validation approach to ensure reasoning consistency through independent answer verification, though the substring-based evidence checking has known limitations. The complete dataset of 3,190 questions spans three difficulty levels and underwent validation by one medical expert and four students, achieving 94.2 percent approval with substantial inter-rater agreement (Fleiss' kappa = 0.82). We benchmark seven open-source models on VietMed-MCQ. Results reveal that general-purpose models with strong Chinese priors outperform Vietnamese-centric models, highlighting cross-lingual conceptual transfer, while all models still struggle with complex diagnostic reasoning. Our code and dataset are publicly available to foster research in low-resource medical domains.
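The consistency filter described in the abstract can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the `MCQ` fields, the `verifier` callable (standing in for the second model), and the normalization steps are all hypothetical names chosen for illustration. It keeps a generated question only if the quoted evidence appears as a substring of the retrieved passage and an independent verifier picks the same answer as the generator.

```python
import unicodedata
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MCQ:
    """Hypothetical record for one generated question (field names are assumptions)."""
    question: str
    options: Dict[str, str]   # e.g. {"A": "...", "B": "..."}
    answer: str               # option key chosen by the generator model
    evidence: str             # span the generator quoted as support
    context: str              # retrieved passage the question was built from


# Assumed verifier interface: (question, options) -> option key.
Verifier = Callable[[str, Dict[str, str]], str]


def _normalize(text: str) -> str:
    # NFC-normalize so composed and decomposed Vietnamese diacritics compare
    # equal, then lowercase and collapse runs of whitespace.
    return " ".join(unicodedata.normalize("NFC", text).lower().split())


def evidence_in_context(item: MCQ) -> bool:
    """Substring-based evidence check; an empty evidence span always fails."""
    evidence = _normalize(item.evidence)
    return bool(evidence) and evidence in _normalize(item.context)


def is_consistent(item: MCQ, verifier: Verifier) -> bool:
    """Keep an item only if the evidence is found verbatim and the
    independent verifier agrees with the generator's answer."""
    if not evidence_in_context(item):
        return False
    return verifier(item.question, item.options) == item.answer


def filter_items(items: List[MCQ], verifier: Verifier) -> List[MCQ]:
    return [item for item in items if is_consistent(item, verifier)]
```

The sketch also makes the limitation noted in the abstract concrete: a faithful but paraphrased evidence span fails the substring test, so a valid question can be discarded even when both models agree on the answer.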