Exploring Neural Net Augmentation to BERT for Question Answering on SQUAD 2.0

Suhas Gupta

arXiv:1908.01767 · cs.CL · March 10, 2020 · 1 citation

Open Access

TL;DR

This paper investigates augmenting BERT with various neural network architectures, including a contextualized CNN, to improve question answering performance on SQUAD 2.0, demonstrating enhanced accuracy on answerable and unanswerable questions.

Contribution

It introduces neural network augmentations to BERT for question answering and compares their effectiveness, highlighting the contextualized CNN as the most successful architecture.

Findings

01

The contextualized CNN achieved an F1 score of 75.32 on unanswerable questions.

02

Fine-tuning BERT improves adaptation to question answering tasks.

03

Augmentations enhance BERT's performance on SQUAD 2.0.

Abstract

Enhancing machine capabilities to answer questions has been a topic of considerable focus in recent years of NLP research. Models like Embeddings from Language Models (ELMo) [1] and Bidirectional Encoder Representations from Transformers (BERT) [2] have been very successful as general-purpose language models that can be optimized for a large number of downstream language tasks. In this work, we focused on augmenting the pre-trained BERT language model with different output neural net architectures and compared their performance on the question answering task posed by the Stanford Question Answering Dataset 2.0 (SQUAD 2.0) [3]. We also fine-tuned the pre-trained BERT model parameters to demonstrate its effectiveness in adapting to specialized language tasks. Our best output network is the contextualized CNN, which performs on both the unanswerable and…
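
To make the answerable/unanswerable distinction concrete, here is a minimal sketch of how the start and end scores from a span-extraction output network are commonly decoded on SQUAD 2.0-style data. It assumes the convention used in the original BERT fine-tuning setup, where token position 0 (the [CLS] token) stands for "no answer"; it is not taken from this paper. The function name `select_span` and the parameters `null_threshold` and `max_answer_len` are illustrative assumptions.

```python
from typing import List, Optional, Tuple


def select_span(
    start_logits: List[float],
    end_logits: List[float],
    null_threshold: float = 0.0,   # hypothetical tuning knob
    max_answer_len: int = 30,      # hypothetical cap on span length, in tokens
) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the best span, or None for "no answer".

    Assumes position 0 is the [CLS] token and represents the null answer.
    """
    if len(start_logits) != len(end_logits):
        raise ValueError("start_logits and end_logits must have the same length")

    n = len(start_logits)
    # Score assigned to "this question has no answer in the passage".
    null_score = start_logits[0] + end_logits[0]

    best_score = float("-inf")
    best_span: Optional[Tuple[int, int]] = None
    # Search all valid non-null spans: start <= end, bounded length.
    for i in range(1, n):
        for j in range(i, min(i + max_answer_len, n)):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score = score
                best_span = (i, j)

    # Predict "no answer" when the null score beats the best span
    # by more than the threshold.
    if best_span is None or null_score - best_score > null_threshold:
        return None
    return best_span
```

Under these assumptions, raising `null_threshold` makes the decoder more willing to return a span, and lowering it makes the decoder abstain more often; in practice such a threshold would be tuned on a development set.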

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax