Question Generation by Transformers
Kettip Kriangchaivech, Artit Wangperawong

TL;DR
This paper presents a transformer-based model for automatic question generation from Wikipedia passages, trained on the SQuAD dataset, capable of producing grammatically correct questions with reasonable relevance.
Contribution
It introduces a transformer-based approach for question generation that outperforms RNNs and demonstrates effective question creation on Wikipedia passages.
Findings
Generated questions are mostly grammatically correct.
Questions differ from original SQuAD questions but are plausible.
Generated questions average 8 words in length.
Abstract
A machine learning model was developed to automatically generate questions from Wikipedia passages using transformers, an attention-based model eschewing the paradigm of existing recurrent neural networks (RNNs). The model was trained on the inverted Stanford Question Answering Dataset (SQuAD), which is a reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles. After training, the question generation model is able to generate simple questions relevant to unseen passages and answers containing an average of 8 words per question. The word error rate (WER) was used as a metric to compare the similarity between SQuAD questions and the model-generated questions. Although the high average WER suggests that the questions generated differ from the original SQuAD questions, the questions generated are mostly grammatically correct and…
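The abstract uses word error rate (WER) to compare each generated question with its SQuAD counterpart. As a minimal sketch of how that metric is commonly computed, the function below takes the word-level edit distance (substitutions, insertions, deletions) and divides it by the number of words in the reference. The function name, the whitespace tokenization, and the example sentences are illustrative assumptions, not the authors' implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are obtained by splitting on whitespace; the paper's
    exact tokenization and normalization are not stated in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to reach an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Hypothetical example pair, not taken from SQuAD or the paper.
reference_question = "what is the capital of france"
generated_question = "what is the capital city of france"
wer = word_error_rate(reference_question, generated_question)
```

Because the distance is normalized by the reference length, WER can exceed 1 when the generated question is much longer than the reference. A high WER only indicates that the wording differs from the reference, which is consistent with the abstract's point that differently worded questions can still be grammatical.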
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
