TyDi QA-WANA: A Benchmark for Information-Seeking Question Answering in Languages of West Asia and North Africa
Parker Riley, Siamak Shakeri, Waleed Ammar, Jonathan H. Clark

TL;DR
TyDi QA-WANA introduces a 28K-example, culturally relevant question-answering dataset covering 10 language varieties of West Asia and North Africa, designed to evaluate models' ability to use long text contexts in information-seeking scenarios.
Contribution
The paper presents a new multilingual dataset for information-seeking QA in West Asian and North African languages, collected without translation to ensure cultural relevance.
Findings
The two baseline models show varying performance across languages.
The dataset enables evaluation of large-context question answering.
Code and data are publicly released for research use.
Abstract
We present TyDi QA-WANA, a question-answering dataset consisting of 28K examples divided among 10 language varieties of western Asia and northern Africa. The data collection process was designed to elicit information-seeking questions, where the asker is genuinely curious to know the answer. Each question is paired with an entire article that may or may not contain the answer; the relatively large size of the articles results in a task suitable for evaluating models' abilities to utilize large text contexts in answering questions. Furthermore, the data was collected directly in each language variety, without the use of translation, in order to avoid issues of cultural relevance. We present performance of two baseline models, and release our code and data to facilitate further improvement by the research community.
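The task setup described in the abstract (a question paired with a whole article that may or may not contain the answer) can be pictured with a minimal Python sketch. The class name `QAExample`, its field names, and the helper `answerable_fraction` are hypothetical and chosen only for illustration; they are not the schema or code of the released dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QAExample:
    """Hypothetical record layout; field names are illustrative only."""
    language: str           # language variety label (assumed field)
    question: str           # information-seeking question
    article: str            # full article text paired with the question
    answer: Optional[str]   # None when the article does not contain the answer


def answerable_fraction(examples: List[QAExample]) -> float:
    """Return the share of examples whose article contains an answer."""
    if not examples:
        return 0.0
    answerable = sum(1 for ex in examples if ex.answer is not None)
    return answerable / len(examples)


# Toy placeholder records, not taken from the dataset.
toy_examples = [
    QAExample("variety-a", "question 1", "long article text ...", "an answer span"),
    QAExample("variety-b", "question 2", "long article text ...", None),
]
fraction = answerable_fraction(toy_examples)
```

Representing the missing answer as `None` keeps unanswerable questions in the data, so a model can be scored on recognizing that an article lacks the answer as well as on finding one.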
Taxonomy
Topics: Language, Linguistics, Cultural Analysis · Speech and dialogue systems
