MediQAl: A French Medical Question Answering Dataset for Knowledge and Reasoning Evaluation
Adrien Bazoge

TL;DR
MediQAl is a French medical dataset for evaluating language models' ability to recall medical facts and reason through clinical scenarios.
Contribution
MediQAl introduces a new French medical QA dataset with diverse tasks and cognitive labels for evaluating factual recall and reasoning.
Findings
MediQAl contains 32,603 questions across 41 medical subjects with three types of tasks.
Evaluation with 14 large language models reveals a significant performance gap between factual recall and reasoning tasks.
The dataset provides a benchmark for French medical QA, addressing a multilingual resource gap.
Abstract
This work introduces MediQAl, a French medical question answering dataset designed to evaluate the capabilities of language models in factual medical recall and reasoning over real-world clinical scenarios. MediQAl contains 32,603 questions sourced from French medical examinations across 41 medical subjects. The dataset includes three tasks: (i) Multiple-Choice Question with Unique answer, (ii) Multiple-Choice Question with Multiple answer, and (iii) Open-Ended Question with Short-Answer. Each question is labeled as Understanding or Reasoning, enabling a detailed analysis of models’ cognitive capabilities. We validate the MediQAl dataset through extensive evaluation with 14 large language models, including recent reasoning-augmented models, and observe a significant performance gap between factual recall and reasoning tasks. Our evaluation provides a comprehensive benchmark for…
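The abstract's Understanding/Reasoning labels suggest a per-label breakdown of model scores. The following is a minimal sketch of how such a breakdown and the resulting gap might be computed. It is not the authors' evaluation code. The record fields, the task abbreviations, and the reduction of every answer to a single correct/incorrect flag are all assumptions made for illustration, and open-ended answers in particular would likely need a softer metric.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredQuestion:
    """One evaluated question. Field names are hypothetical, not the dataset's schema."""
    task: str      # illustrative abbreviations: "MCQU", "MCQM", or "OEQ"
    label: str     # "Understanding" or "Reasoning"
    correct: bool  # assumed binary judgment of the model's answer


def accuracy_by_label(results):
    """Return {label: fraction of questions answered correctly}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in results:
        totals[r.label] += 1
        hits[r.label] += int(r.correct)
    return {label: hits[label] / totals[label] for label in totals}


# Toy, made-up results used only to exercise the function.
toy = [
    ScoredQuestion("MCQU", "Understanding", True),
    ScoredQuestion("MCQU", "Understanding", True),
    ScoredQuestion("MCQM", "Reasoning", True),
    ScoredQuestion("OEQ", "Reasoning", False),
]
scores = accuracy_by_label(toy)
gap = scores["Understanding"] - scores["Reasoning"]
print(scores, gap)
```

Grouping by `task` instead of `label` would give the per-task view in the same way.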
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare
