HEAD-QA: A Healthcare Dataset for Complex Reasoning
David Vilares, Carlos Gómez-Rodríguez

TL;DR
HEAD-QA is a challenging healthcare question answering dataset designed to promote research on complex reasoning, highlighting gaps in current methods and serving as a benchmark for future advancements.
Contribution
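The paper introduces HEAD-QA, a multi-choice question answering dataset for complex healthcare reasoning, and evaluates information retrieval and neural models in monolingual (Spanish) and cross-lingual (English) settings, revealing a significant gap to human performance.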
Findings
HEAD-QA challenges existing QA methods
Current models perform significantly worse than humans
HEAD-QA serves as an effective benchmark for future research
Abstract
We present HEAD-QA, a multi-choice question answering testbed to encourage research on complex reasoning. The questions come from exams to access a specialized position in the Spanish healthcare system, and are challenging even for highly specialized humans. We then consider monolingual (Spanish) and cross-lingual (to English) experiments with information retrieval and neural techniques. We show that: (i) HEAD-QA challenges current methods, and (ii) the results lag well behind human performance, demonstrating its usefulness as a benchmark for future work.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
