Long Context Question Answering via Supervised Contrastive Learning

Avi Caciularu; Ido Dagan; Jacob Goldberger; Arman Cohan

arXiv:2112.08777 · cs.CL · May 9, 2022


TL;DR

This paper adds a contrastive supervision signal during finetuning so that long-context question answering models better identify supporting evidence, which improves performance on the HotpotQA and QAsper benchmarks.

Contribution

The paper proposes an additional sequence-level contrastive objective for long-context QA models, which improves identification of supporting evidence and yields consistent gains in QA performance.

Findings

01. Consistent performance improvements on the HotpotQA and QAsper benchmarks.

02. Better discrimination of supporting evidence sentences from negative ones through the contrastive loss.

03. Improved long-context QA performance across three strong long-context transformer models.

Abstract

Long-context question answering (QA) tasks require reasoning over a long document or multiple documents. Addressing these tasks often benefits from identifying a set of evidence spans (e.g., sentences), which provide supporting evidence for answering the question. In this work, we propose a novel method for equipping long-context QA models with an additional sequence-level objective for better identification of the supporting evidence. We achieve this via an additional contrastive supervision signal in finetuning, where the model is encouraged to explicitly discriminate supporting evidence sentences from negative ones by maximizing question-evidence similarity. The proposed additional loss exhibits consistent improvements on three different strong long-context transformer models, across two challenging question answering benchmarks -- HotpotQA and QAsper.
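The contrastive signal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an InfoNCE-style formulation in which the question and each candidate sentence are already encoded as fixed-size vectors, similarity is cosine similarity, and a temperature scales the logits. The names `cosine`, `evidence_contrastive_loss`, and `temperature` are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def evidence_contrastive_loss(
    question: Sequence[float],
    sentences: Sequence[Sequence[float]],
    is_evidence: Sequence[bool],
    temperature: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """InfoNCE-style loss (an assumed form, for illustration only).

    For every supporting-evidence sentence, the loss is the negative log of
    its softmax probability among all candidate sentences, where the logits
    are question-sentence similarities divided by the temperature. The
    result is averaged over the evidence sentences.
    """
    if not any(is_evidence):
        return 0.0  # no positives, so there is nothing to contrast

    logits = [cosine(question, s) / temperature for s in sentences]

    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_z = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))

    positive_log_probs = [
        logit - log_z for logit, flag in zip(logits, is_evidence) if flag
    ]
    return -sum(positive_log_probs) / len(positive_log_probs)
```

Under these assumptions, the loss is small when the evidence sentences are the ones most similar to the question and large when a negative sentence is more similar, which is the discrimination behaviour the abstract describes. In finetuning, a term like this would be added to the usual QA loss rather than replace it.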


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications