KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations
Sunjun Kweon, Byungjin Choi, Gyouk Chu, Junyeong Song, Daeun Hyeon, Sujin Gan, Jueon Kim, Minkyu Kim, Rae Woong Park, Edward Choi

TL;DR
KorMedMCQA is a new Korean medical multiple-choice question benchmark derived from licensing exams, enabling evaluation of AI models in Korean healthcare contexts and highlighting the importance of region-specific benchmarks.
Contribution
This paper introduces KorMedMCQA, the first Korean medical QA benchmark, and evaluates various large language models, demonstrating the impact of reasoning techniques and the limitations of cross-region benchmarks.
Findings
Chain of Thought reasoning improves model performance by up to 4.5%.
MedQA does not reliably predict Korean medical QA performance.
Region-specific benchmarks are essential for accurate AI evaluation in healthcare.
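The first finding compares two ways of answering the same multiple-choice questions: answering directly, and answering after Chain of Thought reasoning. A minimal sketch of how such a comparison could be scored is shown below. It is not the authors' evaluation code. The "Answer: N" output format, the five-option assumption, the function names, and the toy responses are all hypothetical and chosen only for illustration. The gain is computed as a plain difference in accuracy; this summary does not say whether the reported 4.5% is an absolute or a relative improvement.

```python
import re
from typing import List, Optional


def extract_choice(response: str, num_choices: int = 5) -> Optional[int]:
    """Return the last 'Answer: N' choice found in a model response.

    Assumes (hypothetically) that the model was prompted to finish with
    'Answer: N', where N is an option number from 1 to num_choices.
    """
    matches = re.findall(r"answer\s*:\s*\(?(\d)\)?", response, flags=re.IGNORECASE)
    valid = [int(m) for m in matches if 1 <= int(m) <= num_choices]
    return valid[-1] if valid else None


def accuracy(responses: List[str], gold: List[int], num_choices: int = 5) -> float:
    """Fraction of responses whose extracted choice matches the gold answer."""
    correct = sum(
        extract_choice(r, num_choices) == g for r, g in zip(responses, gold)
    )
    return correct / len(gold)


# Toy data, invented for illustration only.
gold = [2, 4, 1, 3]
direct_responses = ["Answer: 2", "Answer: 3", "Answer: 1", "Answer: 1"]
cot_responses = [
    "The symptoms point to option 2. Answer: 2",
    "Ruling out 1 and 3 leaves 4. Answer: 4",
    "Answer: 1",
    "I am not sure.",  # no parsable answer, counted as wrong
]

direct_acc = accuracy(direct_responses, gold)
cot_acc = accuracy(cot_responses, gold)
gain = cot_acc - direct_acc
```

Responses with no parsable answer are counted as incorrect here, which is one possible convention; a different policy (for example, excluding them) would change the measured gap.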
Abstract
We present KorMedMCQA, the first Korean Medical Multiple-Choice Question Answering benchmark, derived from professional healthcare licensing examinations conducted in Korea between 2012 and 2024. The dataset contains 7,469 questions from examinations for doctor, nurse, pharmacist, and dentist, covering a wide range of medical disciplines. We evaluate the performance of 59 large language models, spanning proprietary and open-source models, multilingual and Korean-specialized models, and those fine-tuned for clinical applications. Our results show that applying Chain of Thought (CoT) reasoning can enhance the model performance by up to 4.5% compared to direct answering approaches. We also investigate whether MedQA, one of the most widely used medical benchmarks derived from the U.S. Medical Licensing Examination, can serve as a reliable proxy for evaluating model performance in other…
Taxonomy
Topics: Pharmacy and Medical Practices
Methods: ALIGN
