MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering
Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu

TL;DR
MedMCQA is a large-scale dataset of more than 194,000 multiple-choice questions from the AIIMS and NEET PG medical entrance exams, designed to test how well AI models understand and reason across diverse medical topics.
Contribution
The paper introduces MedMCQA, a dataset for medical question answering that spans 2.4k healthcare topics and 21 medical subjects and provides explanations of the solutions alongside the questions, answers, and options.
Findings
High topical diversity across 2.4k healthcare topics and 21 medical subjects
Questions test more than 10 reasoning abilities
Large-scale dataset (more than 194k questions) for medical AI research
Abstract
This paper introduces MedMCQA, a new large-scale Multiple-Choice Question Answering (MCQA) dataset designed to address real-world medical entrance exam questions. More than 194k high-quality AIIMS & NEET PG entrance exam MCQs covering 2.4k healthcare topics and 21 medical subjects are collected, with an average token length of 12.77 and high topical diversity. Each sample contains a question, the correct answer(s), and other options, and requires deeper language understanding, as it tests 10+ reasoning abilities of a model across a wide range of medical subjects & topics. A detailed explanation of the solution, along with the above information, is provided in this study.
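To make the sample structure described in the abstract concrete, here is a minimal Python sketch of one multiple-choice item and an accuracy score over a set of predictions. The class name, field names, and the `accuracy` helper are hypothetical and chosen for illustration; they are not the dataset's actual schema or the authors' evaluation code. The sketch also assumes a single correct option per question, and the example questions are made up rather than drawn from MedMCQA.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class MCQSample:
    """One multiple-choice item. Field names are hypothetical, not the dataset's schema."""
    question: str
    options: List[str]
    answer_index: int                  # assumption: exactly one correct option
    explanation: Optional[str] = None  # explanation of the solution, if available
    subject: Optional[str] = None      # e.g. one of the 21 medical subjects


def accuracy(samples: Sequence[MCQSample], predictions: Sequence[int]) -> float:
    """Fraction of samples whose predicted option index matches the correct one."""
    if len(samples) != len(predictions):
        raise ValueError("samples and predictions must have the same length")
    if not samples:
        return 0.0
    correct = sum(1 for s, p in zip(samples, predictions) if s.answer_index == p)
    return correct / len(samples)


# Made-up illustrative items (not taken from MedMCQA).
examples = [
    MCQSample(
        question="Deficiency of which vitamin causes scurvy?",
        options=["Vitamin A", "Vitamin B12", "Vitamin C", "Vitamin D"],
        answer_index=2,
        explanation="Scurvy results from a lack of vitamin C (ascorbic acid).",
    ),
    MCQSample(
        question="Which organ produces insulin?",
        options=["Pancreas", "Liver", "Spleen", "Kidney"],
        answer_index=0,
    ),
]

# A model that answers the first item correctly and the second incorrectly.
score = accuracy(examples, [2, 1])
```

Keeping the explanation and subject as optional fields lets the same record type serve both for scoring, which only needs the answer index, and for per-subject or explanation-based analysis.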
Code & Models
- 🤗 google/medgemma-1.5-4b-it (model, 86k downloads, 536 likes)
- 🤗 unsloth/medgemma-1.5-4b-it-GGUF (model, 6.7k downloads, 33 likes)
- 🤗 dmis-lab/meerkat-7b-v1.0 (model, 71 downloads, 28 likes)
- 🤗 dmis-lab/llama-3-meerkat-8b-v1.0 (model, 274 downloads, 8 likes)
- 🤗 dmis-lab/llama-3-meerkat-70b-v1.0 (model, 10 downloads, 6 likes)
- 🤗 RichardErkhov/dmis-lab_-_llama-3-meerkat-8b-v1.0-gguf (model, 212 downloads)
- 🤗 RichardErkhov/dmis-lab_-_llama-3-meerkat-70b-v1.0-gguf (model, 69 downloads)
- 🤗 RichardErkhov/dmis-lab_-_meerkat-7b-v1.0-gguf (model, 266 downloads)
- 🤗 RichardErkhov/dmis-lab_-_llama-3-meerkat-8b-v1.0-8bits (model, 1 download)
- 🤗 RichardErkhov/dmis-lab_-_llama-3-meerkat-8b-v1.0-awq (model)
Taxonomy
Topics: Topic Modeling
