Choose Your QA Model Wisely: A Systematic Study of Generative and Extractive Readers for Question Answering
Man Luo, Kazuma Hashimoto, Semih Yavuz, Zhiwei Liu, Chitta Baral, Yingbo Zhou

TL;DR
This paper systematically compares extractive and generative question answering readers built on nine transformer-based pre-trained language models (PrLMs), characterizing their relative strengths in different settings to guide reader selection and future research.
Contribution
First comprehensive study comparing extractive and generative QA readers across multiple transformer architectures, providing insights into their strengths and weaknesses.
Findings
Generative readers excel in long-context QA.
Extractive readers perform better in short-context and out-of-domain settings.
The encoder of an encoder-decoder PrLM such as T5 is a strong extractive reader.
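For readers unfamiliar with the distinction behind these findings: an extractive reader typically scores every context token as a possible answer start and answer end, then returns the highest-scoring span, whereas a generative reader decodes the answer token by token. The sketch below illustrates only the extractive span-selection step in plain Python. It is a generic illustration, not the paper's implementation; the function name `best_span`, the `max_answer_len` parameter, and the toy logits are all assumptions made for this example.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score.

    Hypothetical helper: scores a span as start_logits[s] + end_logits[e],
    subject to s <= e and a span length of at most max_answer_len tokens.
    """
    if not start_logits or len(start_logits) != len(end_logits):
        raise ValueError("start_logits and end_logits must be non-empty and equal in length")

    best_score = float("-inf")
    best = (0, 0)
    for s, s_score in enumerate(start_logits):
        # Only consider end positions at or after the start, within the length cap.
        last = min(len(end_logits), s + max_answer_len)
        for e in range(s, last):
            score = s_score + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Toy example with made-up logits (not real model output).
tokens = ["The", "capital", "of", "France", "is", "Paris", "."]
start = [0.1, 0.2, 0.0, 1.0, 0.1, 3.0, 0.0]
end = [0.0, 0.1, 0.0, 0.5, 0.2, 2.5, 0.3]

s, e = best_span(start, end)
answer = " ".join(tokens[s : e + 1])
print(answer)
```

Because the answer is always a contiguous slice of the context, an extractive reader cannot produce text that does not appear in the passage; a generative reader has no such constraint.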
Abstract
While both extractive and generative readers have been successfully applied to the Question Answering (QA) task, little attention has been paid toward the systematic comparison of them. Characterizing the strengths and weaknesses of the two readers is crucial not only for making a more informed reader selection in practice but also for developing a deeper understanding to foster further research on improving readers in a principled manner. Motivated by this goal, we make the first attempt to systematically study the comparison of extractive and generative readers for question answering. To be aligned with the state-of-the-art, we explore nine transformer-based large pre-trained language models (PrLMs) as backbone architectures. Furthermore, we organize our findings under two main categories: (1) keeping the architecture invariant, and (2) varying the underlying PrLMs. Among several…
