MediQAl: A French Medical Question Answering Dataset for Knowledge and Reasoning Evaluation

Adrien Bazoge

PMC · DOI: 10.1038/s41597-026-06680-y · February 5, 2026

Open Access

TL;DR

MediQAl is a French medical dataset for evaluating language models' ability to recall medical facts and reason through clinical scenarios.

Contribution

MediQAl introduces a new French medical QA dataset with diverse tasks and cognitive labels for evaluating factual recall and reasoning.

Findings

01. MediQAl contains 32,603 questions across 41 medical subjects with three types of tasks.

02. Evaluation with 14 large language models reveals a significant performance gap between factual recall and reasoning tasks.

03. The dataset provides a benchmark for French medical QA, addressing a multilingual resource gap.
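The recall-versus-reasoning gap in the second finding can be illustrated with a minimal sketch. It assumes each multiple-choice question carries a cognitive label (Understanding or Reasoning) and is scored by exact match on the set of selected options. The `Question` record, its field names, the short task codes, and the sample data are all hypothetical and are not taken from the paper, which may use different metrics.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    # Hypothetical record layout, not the dataset's actual schema.
    task: str             # assumed short code: "MCQU" (unique) or "MCQM" (multiple)
    label: str            # "Understanding" or "Reasoning"
    gold: frozenset       # correct option letters
    predicted: frozenset  # option letters chosen by the model


def accuracy_by_label(questions):
    """Exact-match accuracy per cognitive label (an assumed metric)."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for q in questions:
        totals[q.label] += 1
        if q.predicted == q.gold:
            correct[q.label] += 1
    return {label: correct[label] / totals[label] for label in totals}


# Made-up toy predictions, for illustration only.
sample = [
    Question("MCQU", "Understanding", frozenset("A"), frozenset("A")),
    Question("MCQU", "Understanding", frozenset("C"), frozenset("C")),
    Question("MCQM", "Reasoning", frozenset("AB"), frozenset("AB")),
    Question("MCQM", "Reasoning", frozenset("BD"), frozenset("B")),
]

scores = accuracy_by_label(sample)
gap = scores["Understanding"] - scores["Reasoning"]
print(scores, gap)
```

Open-ended short-answer questions are left out of this sketch because exact match on option sets does not apply to free text.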

Abstract

This work introduces MediQAl, a French medical question answering dataset designed to evaluate the capabilities of language models in factual medical recall and reasoning over real-world clinical scenarios. MediQAl contains 32,603 questions sourced from French medical examinations across 41 medical subjects. The dataset includes three tasks: (i) Multiple-Choice Question with Unique answer, (ii) Multiple-Choice Question with Multiple answer, and (iii) Open-Ended Question with Short-Answer. Each question is labeled as Understanding or Reasoning, enabling a detailed analysis of models’ cognitive capabilities. We validate the MediQAl dataset through extensive evaluation with 14 large language models, including recent reasoning-augmented models, and observe a significant performance gap between factual recall and reasoning tasks. Our evaluation provides a comprehensive benchmark for…

Figures: 11

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare