A Comparative Study of Transformer-Based Language Models on Extractive Question Answering

Kate Pearce; Tiffany Zhan; Aneesh Komanduri; Justin Zhan

arXiv:2110.03142 · cs.CL · October 8, 2021 · 22 cites

Open Access

TL;DR

This study evaluates several transformer-based language models on extractive question answering datasets. It finds RoBERTa and BART to be the top performers and introduces a new BERT-BiLSTM architecture that outperforms baseline BERT.

Contribution

The paper compares multiple pre-trained models on diverse datasets and proposes a novel BERT-BiLSTM architecture to enhance extractive QA performance.

Findings

01. RoBERTa and BART outperform other models across datasets
02. BERT-BiLSTM surpasses baseline BERT in accuracy
03. Models show varying generalizability depending on dataset difficulty

Abstract

Question Answering (QA) is a task in natural language processing that has seen considerable growth since the advent of transformers. There has been a surge in QA datasets proposed to challenge natural language processing models to improve on human and existing model performance. Many pre-trained language models have proven to be highly effective at the task of extractive question answering. However, generalizability remains a challenge for the majority of these models. That is, some datasets require models to reason more than others. In this paper, we take various pre-trained language models and fine-tune them on multiple question answering datasets of varying levels of difficulty to determine which of the models generalize most comprehensively across different datasets. Further, we propose a new architecture, BERT-BiLSTM, and compare it with…

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Expert finding and Q&A systems

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Weight Decay · Multi-Head Attention · Dropout · Layer Normalization · WordPiece · Adam