VietMed-MCQ: A Consistency-Filtered Data Synthesis Framework for Vietnamese Traditional Medicine Evaluation
Huynh Trung Kiet, Dao Sy Duy Minh, Nguyen Dinh Ha Duong, Le Hoang Minh Huy, Long Nguyen, and Dien Dinh

TL;DR
This paper introduces VietMed-MCQ, a high-quality, consistency-checked multiple-choice question dataset for Vietnamese Traditional Medicine, enabling better evaluation of language models in this specialized, low-resource domain.
Contribution
We developed VietMed-MCQ using a retrieval-augmented generation pipeline with dual-model validation, creating a validated dataset for Vietnamese Traditional Medicine evaluation.
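The dual-model validation idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each generated question carries the generator's answer key and a supporting quote, and that a second model is available as a plain callable. The names `Candidate`, `normalize`, `evidence_in_context`, `consistency_filter`, and the `verifier` signature are hypothetical. The substring check mirrors the evidence check mentioned in the abstract and shares its limitation: a paraphrased quote is rejected even when it is faithful to the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List
import unicodedata


@dataclass
class Candidate:
    """One generated question (hypothetical schema, for illustration only)."""
    question: str
    options: Dict[str, str]  # option key -> option text
    answer: str              # key chosen by the generator model
    evidence: str            # quote the generator claims supports the answer
    context: str             # retrieved passage the question was built from


# Assumed verifier interface: (question, options, context) -> option key.
Verifier = Callable[[str, Dict[str, str], str], str]


def normalize(text: str) -> str:
    """NFC-normalize, lowercase, and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).lower().split())


def evidence_in_context(c: Candidate) -> bool:
    """Substring check: the quoted evidence must occur in the passage."""
    ev = normalize(c.evidence)
    return bool(ev) and ev in normalize(c.context)


def consistency_filter(candidates: List[Candidate],
                       verifier: Verifier) -> List[Candidate]:
    """Keep candidates whose evidence is found in the context and whose
    answer key matches the verifier's independently chosen key."""
    kept = []
    for c in candidates:
        if not evidence_in_context(c):
            continue  # evidence quote not found in the retrieved passage
        if verifier(c.question, c.options, c.context) != c.answer:
            continue  # the two models disagree on the answer
        kept.append(c)
    return kept
```

In practice `verifier` would wrap a call to a second language model. Passing a stub function instead makes the filter easy to unit-test without any model access.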
Findings
General-purpose models with strong Chinese-language priors outperform Vietnamese-focused models.
All models struggle with complex diagnostic reasoning.
The dataset achieved 94.2% expert approval.
Abstract
Large Language Models (LLMs) have demonstrated remarkable proficiency in general medical domains. However, their performance significantly degrades in specialized, culturally specific domains such as Vietnamese Traditional Medicine (VTM), primarily due to the scarcity of high-quality, structured benchmarks. In this paper, we introduce VietMed-MCQ, a novel multiple-choice question dataset generated via a Retrieval-Augmented Generation (RAG) pipeline with an automated consistency check mechanism. Unlike previous synthetic datasets, our framework incorporates a dual-model validation approach to ensure reasoning consistency through independent answer verification, though the substring-based evidence checking has known limitations. The complete dataset of 3,190 questions spans three difficulty levels and underwent validation by one medical expert and four students, achieving 94.2 percent…
Taxonomy
Topics: Traditional Chinese Medicine Studies · Topic Modeling · Machine Learning in Healthcare
