Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection

Luca Di Liello; Siddhant Garg; Luca Soldaini; Alessandro Moschitti

arXiv:2205.10455 · cs.CL · October 21, 2022 · 5 cites

TL;DR

This paper introduces three novel sentence-level transformer pre-training objectives that leverage paragraph-level semantics to enhance answer sentence selection performance, especially with limited labeled data.

Contribution

The paper proposes three sentence-level pre-training objectives that incorporate paragraph-level semantics, within and across documents, to improve transformer models for answer sentence selection.

Findings

01

Transformers pre-trained with the proposed objectives outperform baselines such as RoBERTa and ELECTRA on three public AS2 datasets and one industrial AS2 dataset.

02

The proposed objectives incorporate paragraph-level semantics both within and across documents.

03

The pre-training objectives reduce the need for large labeled AS2 datasets.

Abstract

An important task for designing QA systems is answer sentence selection (AS2): selecting the sentence containing (or constituting) the answer to a question from a set of retrieved relevant documents. In this paper, we propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transformers for AS2, and mitigate the requirement of large labeled datasets. Specifically, the model is tasked to predict whether: (i) two sentences are extracted from the same paragraph, (ii) a given sentence is extracted from a given paragraph, and (iii) two paragraphs are extracted from the same document. Our experiments on three public and one industrial AS2 datasets demonstrate the empirical superiority of our pre-trained transformers over baseline models such as RoBERTa and ELECTRA for AS2.
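
The three objectives in the abstract are all binary classification tasks over text pairs, so one way to picture them is as a data-construction step. The Python sketch below builds labeled pairs for each objective from documents that are already split into paragraphs and sentences. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the one-positive-one-negative sampling scheme, and the choice to remove the target sentence from its paragraph in objective (ii) are all hypothetical and are not specified in the abstract.

```python
import random
from typing import List, Tuple

# Assumed input format: a document is a list of paragraphs,
# and a paragraph is a list of sentence strings.
Paragraph = List[str]
Document = List[Paragraph]
Example = Tuple[str, str, int]  # (text_a, text_b, label)


def same_paragraph_pairs(doc: Document, rng: random.Random) -> List[Example]:
    """Objective (i): were the two sentences extracted from the same paragraph?"""
    examples = []
    for i, para in enumerate(doc):
        if len(para) < 2:
            continue
        a, b = rng.sample(para, 2)
        examples.append((a, b, 1))
        # Assumption: negatives come from other paragraphs of the same document.
        others = [s for j, p in enumerate(doc) if j != i for s in p]
        if others:
            examples.append((a, rng.choice(others), 0))
    return examples


def sentence_in_paragraph_pairs(doc: Document, rng: random.Random) -> List[Example]:
    """Objective (ii): was the sentence extracted from the given paragraph?"""
    examples = []
    for i, para in enumerate(doc):
        if len(para) < 2:
            continue
        k = rng.randrange(len(para))
        # Assumption: the target sentence is removed from the paragraph text.
        rest = " ".join(para[:k] + para[k + 1:])
        examples.append((para[k], rest, 1))
        others = [s for j, p in enumerate(doc) if j != i for s in p]
        if others:
            examples.append((rng.choice(others), rest, 0))
    return examples


def same_document_pairs(corpus: List[Document], rng: random.Random) -> List[Example]:
    """Objective (iii): were the two paragraphs extracted from the same document?"""
    examples = []
    for d, doc in enumerate(corpus):
        if len(doc) < 2:
            continue
        p, q = rng.sample(doc, 2)
        examples.append((" ".join(p), " ".join(q), 1))
        # Assumption: negatives are paragraphs drawn from a different document.
        other_docs = [x for e, x in enumerate(corpus) if e != d and x]
        if other_docs:
            neg = rng.choice(rng.choice(other_docs))
            examples.append((" ".join(p), " ".join(neg), 0))
    return examples
```

Each resulting (text_a, text_b, label) triple could then be fed to a transformer as a sentence-pair input with a binary classification head, which mirrors the question-candidate pair format that AS2 fine-tuning uses.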

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Linear Warmup With Linear Decay · Dropout · Attention Dropout · WordPiece · Adam · Residual Connection