Quinductor: a multilingual data-driven method for generating reading-comprehension questions using Universal Dependencies

Dmytro Kalpakchi; Johan Boye

arXiv:2103.10121 · cs.CL · May 16, 2023

TL;DR

Quinductor is a multilingual, data-driven approach for generating reading comprehension questions using dependency trees, offering a resource-efficient alternative to neural models, with strong performance and good human evaluation results.

Contribution

It introduces a mostly deterministic, inexpensive baseline method for multilingual question generation that requires less data than neural approaches.

Findings

01 Outperforms QG baselines previously reported in the literature

02 Requires far less training data than neural QG architectures

03 Performs well in human evaluation

Abstract

We propose a multilingual data-driven method for generating reading comprehension questions using dependency trees. Our method provides a strong, mostly deterministic, and inexpensive-to-train baseline for less-resourced languages. While a language-specific corpus is still required, its size is nowhere near those required by modern neural question generation (QG) architectures. Our method surpasses QG baselines previously reported in the literature and shows good performance in terms of human evaluation.

Code & Models

Repositories

dkalpakchi/quinductor (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification