An Empirical Comparison of LM-based Question and Answer Generation Methods

Asahi Ushio; Fernando Alva-Manchego; Jose Camacho-Collados

arXiv:2305.17002·cs.CL·May 29, 2023·2 cites

TL;DR

This paper empirically compares three sequence-to-sequence language model-based question-answer generation methods, demonstrating that a lightweight end-to-end model is robust and effective, with generated data aiding QA model training.

Contribution

It provides a comprehensive baseline comparison of LM-based QAG methods and shows the effectiveness of generated data for training QA models.

Findings

01

The end-to-end QAG model is generally robust and outperforms more complex approaches.

02

Generated question-answer pairs can train competitive QA models.

03

Performance varies depending on the underlying language model.

Abstract

Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context (e.g. a paragraph). This task has a variety of applications, such as data augmentation for question answering (QA) models, information retrieval and education. In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning. Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches. However, there are differences depending on the underlying generative LM. Finally, our analysis shows that QA models fine-tuned solely on generated question-answer pairs can be competitive when compared to supervised QA models trained on human-labeled data.
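To make the last point of the abstract concrete, the following is a minimal sketch of how generated question-answer pairs could be turned into training records for an extractive QA model. It is not the authors' pipeline: the `QAPair` class, the `to_squad_records` helper, the `gen-` id prefix, and the SQuAD-style record layout are all assumptions chosen for illustration, and the context and pairs are toy inputs, not model output.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QAPair:
    """One generated question-answer pair (hypothetical container)."""
    question: str
    answer: str


def to_squad_records(context: str, pairs: List[QAPair]) -> List[Dict]:
    """Convert generated pairs into SQuAD-style extractive QA records.

    Pairs whose answer is empty or does not occur verbatim in the context
    are skipped, because an extractive QA model needs a character offset
    for the answer span.
    """
    records = []
    for i, pair in enumerate(pairs):
        start = context.find(pair.answer)
        if not pair.answer or start == -1:
            continue
        records.append({
            "id": f"gen-{i}",  # assumed id scheme, for illustration only
            "context": context,
            "question": pair.question,
            "answers": {"text": [pair.answer], "answer_start": [start]},
        })
    return records


# Toy inputs standing in for the output of a QAG model.
context = "Marie Curie won the Nobel Prize in Physics in 1903."
pairs = [
    QAPair("Who won the Nobel Prize in Physics in 1903?", "Marie Curie"),
    QAPair("When did Marie Curie win the prize?", "1903"),
    QAPair("Where was she born?", "Warsaw"),  # answer not in context: dropped
]
records = to_squad_records(context, pairs)
```

Dropping pairs whose answer cannot be located in the context is one simple filtering choice; other strategies, such as fuzzy matching, are possible but outside the scope of this sketch.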

Code & Models

Repositories

asahi417/lm-question-generation (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems