Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection
Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti

TL;DR
This paper introduces three novel sentence-level transformer pre-training objectives that leverage paragraph-level semantics to enhance answer sentence selection performance, especially with limited labeled data.
Contribution
The paper proposes three sentence-level pre-training objectives that incorporate paragraph-level semantics, within and across documents, to improve transformer models for answer sentence selection (AS2).
Findings
Transformers pre-trained with the proposed objectives outperform baselines such as RoBERTa and ELECTRA on three public AS2 datasets and one industrial AS2 dataset.
The proposed objectives effectively incorporate paragraph-level context.
The gains are reported to be most pronounced when labeled data is limited.
Abstract
An important task for designing QA systems is answer sentence selection (AS2): selecting the sentence containing (or constituting) the answer to a question from a set of retrieved relevant documents. In this paper, we propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transformers for AS2, and mitigate the requirement of large labeled datasets. Specifically, the model is tasked to predict whether: (i) two sentences are extracted from the same paragraph, (ii) a given sentence is extracted from a given paragraph, and (iii) two paragraphs are extracted from the same document. Our experiments on three public and one industrial AS2 datasets demonstrate the empirical superiority of our pre-trained transformers over baseline models such as RoBERTa and ELECTRA for AS2.
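The three objectives in the abstract are all binary classification tasks over text pairs, so one way to picture them is as a routine that turns raw documents into labeled pairs. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the corpus layout (document, then paragraphs, then sentences), the function name `build_examples`, the task keys, and the negative-sampling choices (negatives for (i) and (ii) drawn from another paragraph of the same document, negatives for (iii) drawn from another document) are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

Document = List[List[str]]      # assumed layout: paragraphs, each a list of sentences
Example = Tuple[str, str, int]  # (text_a, text_b, label), label 1 = positive


def build_examples(docs: List[Document], seed: int = 0) -> Dict[str, List[Example]]:
    """Emit one positive and one negative pair per objective for each usable document.

    Assumes every document and every paragraph is non-empty.
    """
    rng = random.Random(seed)
    tasks: Dict[str, List[Example]] = {
        "same_paragraph": [],  # (i) two sentences from the same paragraph?
        "in_paragraph": [],    # (ii) sentence taken from this paragraph?
        "same_document": [],   # (iii) two paragraphs from the same document?
    }
    for d, doc in enumerate(docs):
        # Paragraphs with at least two sentences can supply a positive sentence pair.
        rich = [i for i, para in enumerate(doc) if len(para) >= 2]
        if len(docs) < 2 or len(doc) < 2 or not rich:
            continue  # not enough material to build every pair type

        p = rng.choice(rich)                                    # source paragraph
        q = rng.choice([i for i in range(len(doc)) if i != p])  # another paragraph, same doc
        a, b = rng.sample(range(len(doc[p])), 2)                # two distinct sentence indices
        anchor, sibling = doc[p][a], doc[p][b]
        outsider = rng.choice(doc[q])                           # sentence from the other paragraph

        # (i) sentence pair: same paragraph vs. different paragraph.
        tasks["same_paragraph"] += [(anchor, sibling, 1), (anchor, outsider, 0)]

        # (ii) sentence vs. the paragraph with the anchor sentence removed.
        rest = " ".join(s for i, s in enumerate(doc[p]) if i != a)
        tasks["in_paragraph"] += [(anchor, rest, 1), (outsider, rest, 0)]

        # (iii) paragraph pair: same document vs. a paragraph from another document.
        other_doc = docs[rng.choice([j for j in range(len(docs)) if j != d])]
        foreign = " ".join(rng.choice(other_doc))
        para_p, para_q = " ".join(doc[p]), " ".join(doc[q])
        tasks["same_document"] += [(para_p, para_q, 1), (para_p, foreign, 0)]
    return tasks
```

Each returned pair could then be fed to a transformer as a standard two-segment input with a binary classification head. How the paper samples negatives, balances classes, or truncates inputs is not stated in the abstract, so those details are left out here.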
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Linear Warmup With Linear Decay · Dropout · Attention Dropout · WordPiece · Adam · Residual Connection
