Perhaps PTLMs Should Go to School -- A Task to Assess Open Book and Closed Book QA

Manuel R. Ciosici; Joe Cecil; Alex Hedges; Dong-Ho Lee; Marjorie Freedman; Ralph Weischedel

arXiv:2110.01552 · cs.CL · October 5, 2021

TL;DR

This paper introduces a new task and benchmark for evaluating pre-trained language models on understanding instructional texts, highlighting their limited zero-shot and knowledge transfer capabilities in open and closed book question answering.

Contribution

It proposes a novel educational question-answering task with a new dataset and leaderboard, assessing PTLMs' ability to understand and utilize textbook content in different settings.

Findings

01. PTLMs reach roughly 50-56% accuracy on the task, close to the ~50% expected from random guessing.

02. Adding the textbooks to pre-training yields minimal performance gains.

03. The open-book setting improves accuracy to about 60%.

Abstract

Our goal is to deliver a new task and leaderboard to stimulate research on question answering and pre-trained language models (PTLMs) to understand a significant instructional document, e.g., an introductory college textbook or a manual. PTLMs have shown great success in many question-answering tasks, given significant supervised training, but much less so in zero-shot settings. We propose a new task that includes two college-level introductory texts in the social sciences (American Government 2e) and humanities (U.S. History), hundreds of true/false statements based on review questions written by the textbook authors, validation/development tests based on the first eight chapters of the textbooks, blind tests based on the remaining textbook chapters, and baseline results given state-of-the-art PTLMs. Since the questions are balanced, random performance should be ~50%. T5, fine-tuned…
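The abstract's point that balanced true/false questions put random performance at about 50% can be illustrated with a minimal sketch. The function names `accuracy` and `random_baseline` and the toy label set are hypothetical, chosen only for illustration. This is not the authors' evaluation code and it does not use the benchmark's data.

```python
import random


def accuracy(predictions, labels):
    """Fraction of statements whose predicted truth value matches the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot score an empty test set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def random_baseline(labels, trials=1000, seed=0):
    """Mean accuracy of uniform random true/false guessing over several trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        guesses = [rng.random() < 0.5 for _ in labels]
        total += accuracy(guesses, labels)
    return total / trials


# Hypothetical balanced test set (an assumption): half true, half false.
labels = [True] * 100 + [False] * 100

# A degenerate "model" that always answers True scores exactly half on balanced data.
always_true = [True] * len(labels)
print(accuracy(always_true, labels))  # 0.5

# Random guessing lands near 0.5 as well.
print(round(random_baseline(labels), 2))
```

On a balanced set, neither constant answers nor coin flips can beat roughly 50%, which is why scores in the 50-56% range are read as close to chance.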

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Attention Is All You Need · Linear Layer · Attention Dropout · Residual Connection · Byte Pair Encoding · SentencePiece · Dropout · Dense Connections · Softmax · Gated Linear Unit