ANNA: Enhanced Language Representation for Question Answering

Changwook Jun; Hansol Jang; Myoseop Sim; Hyun Kim; Jooyoung Choi; Kyungkoo Min; Kyunghoon Bae

arXiv:2203.14507·cs.CL·April 5, 2022

TL;DR

This paper introduces ANNA, a pre-trained language model that combines a neighbor-aware mechanism with an extended pre-training task, achieving state-of-the-art results on SQuAD 1.1 and outperforming prior models on SQuAD 2.0.

Contribution

The paper presents a novel neighbor-aware mechanism and an extended pre-training task that, when combined, improve language model performance on question answering tasks.

Findings

01

Achieved 95.7% F1 and 90.6% EM on SQuAD 1.1

02

Outperformed models such as RoBERTa, ALBERT, ELECTRA, and XLNet on SQuAD 2.0

03

Demonstrated the effectiveness of joint pre-training approaches
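
For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how exact match (EM) and token-level F1 are commonly computed for SQuAD-style answers. It is not taken from the paper: the function names `normalize`, `exact_match`, and `f1_score` are illustrative, and the normalization steps (lowercasing, stripping punctuation and English articles) are assumed to follow the usual SQuAD evaluation convention.

```python
import re
import string
from collections import Counter

# Assumption: punctuation set and article list follow the usual SQuAD convention.
_PUNCTUATION = set(string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("in Paris France", "Paris"))
```

A benchmark-level score is then the average of these per-question values, typically taking the maximum over the available reference answers for each question.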

Abstract

Pre-trained language models have brought significant improvements in performance in a variety of natural language processing tasks. Most existing models performing state-of-the-art results have shown their approaches in the separate perspectives of data processing, pre-training tasks, neural network modeling, or fine-tuning. In this paper, we demonstrate how the approaches affect performance individually, and that the language model performs the best results on a specific question answering task when those approaches are jointly considered in pre-training models. In particular, we propose an extended pre-training task, and a new neighbor-aware mechanism that attends neighboring tokens more to capture the richness of context for pre-training language modeling. Our best model achieves new state-of-the-art results of 95.7% F1 and 90.6% EM on SQuAD 1.1 and also outperforms existing…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · SentencePiece · LAMB · BERT · Dropout · Dense Connections