EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering
Momchil Hardalov, Todor Mihaylov, Dimitrina Zlatkova, Yoan Dinkov, Ivan Koychev, Preslav Nakov

TL;DR
EXAMS is a comprehensive multilingual dataset of over 24,000 high school exam questions across 16 languages and 24 subjects, designed to evaluate and advance cross-lingual question answering models.
Contribution
The paper introduces EXAMS, a new benchmark dataset of multilingual high school exam questions that enables fine-grained evaluation of models per language and per subject.
Findings
Existing models struggle with multilingual reasoning tasks.
EXAMS reveals significant challenges in knowledge transfer across languages.
Multilingual pre-trained models show varied performance across subjects.
Abstract
We propose EXAMS -- a new benchmark dataset for cross-lingual and multilingual question answering for high school examinations. We collected more than 24,000 high-quality high school exam questions in 16 languages, covering 8 language families and 24 school subjects from Natural Sciences and Social Sciences, among others. EXAMS offers a fine-grained evaluation framework across multiple languages and subjects, which allows precise analysis and comparison of various models. We perform various experiments with existing top-performing multilingual pre-trained models and we show that EXAMS offers multiple challenges that require multilingual knowledge and reasoning in multiple domains. We hope that EXAMS will enable researchers to explore challenging reasoning and knowledge transfer methods and pre-trained models for school question answering in various languages which was not possible…
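The fine-grained evaluation described in the abstract can be pictured as accuracy broken down by language and subject. The snippet below is a minimal sketch of that idea, not the authors' evaluation code: the record fields (`language`, `subject`, `gold`, `predicted`) and the toy records are hypothetical and are not taken from the dataset's actual schema.

```python
from collections import defaultdict


def accuracy_by_group(records, keys=("language", "subject")):
    """Return {group: accuracy}, where group is a tuple of the values of `keys`.

    `records` is an iterable of dicts. The field names used here are
    assumptions made for illustration, not the real EXAMS format.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = tuple(record[k] for k in keys)
        total[group] += 1
        # Count a hit when the predicted option matches the gold option.
        correct[group] += int(record["predicted"] == record["gold"])
    return {group: correct[group] / total[group] for group in total}


# Made-up toy predictions, for illustration only.
records = [
    {"language": "bg", "subject": "Physics", "gold": "A", "predicted": "A"},
    {"language": "bg", "subject": "Physics", "gold": "C", "predicted": "B"},
    {"language": "hr", "subject": "History", "gold": "D", "predicted": "D"},
]

per_language_subject = accuracy_by_group(records)
per_language = accuracy_by_group(records, keys=("language",))
```

Grouping by a tuple of keys keeps the same function usable for per-language, per-subject, or combined breakdowns.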
