Paragraph-based Transformer Pre-training for Multi-Sentence Inference

Luca Di Liello; Siddhant Garg; Luca Soldaini; Alessandro Moschitti

arXiv:2205.01228·cs.CL·July 8, 2022

TL;DR

This paper introduces a paragraph-based pre-training objective for transformers that models paragraph-level semantics across multiple sentences, improving performance on multi-sentence inference tasks such as answer sentence selection (AS2) and fact verification.

Contribution

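The paper proposes a new pre-training objective that models paragraph-level semantics across multiple input sentences. Transformers pre-trained with it outperform traditionally pre-trained ones both as joint models for multi-candidate inference and as cross-encoders for sentence-pair formulations of the same tasks.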

Findings

01 Popular pre-trained transformers perform poorly when fine-tuned directly on multi-candidate inference tasks.

02 The proposed paragraph-based pre-training outperforms traditional pre-training on three AS2 datasets and one fact verification dataset.

03 Code and models are publicly available for further research.

Abstract

Inference tasks such as answer sentence selection (AS2) or fact verification are typically solved by fine-tuning transformer-based models as individual sentence-pair classifiers. Recent studies show that these tasks benefit from modeling dependencies across multiple candidate sentences jointly. In this paper, we first show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks. We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences. Our evaluation on three AS2 and one fact verification datasets demonstrates the superiority of our pre-training technique over the traditional ones for transformers used as joint models for multi-candidate inference tasks, as well as when used as cross-encoders for sentence-pair formulations of these tasks. Our code and pre-trained…
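The two formulations contrasted in the abstract, individual sentence-pair classification versus joint modeling of multiple candidates, can be illustrated with a minimal sketch of how the inputs might be assembled. This is only an illustration under stated assumptions: the `[CLS]` and `[SEP]` markers, the function names `pairwise_inputs` and `joint_input`, and the example question are hypothetical and not taken from the paper, whose actual input format may differ.

```python
from typing import List

# Hypothetical special tokens; the paper's actual input format may differ.
CLS, SEP = "[CLS]", "[SEP]"


def pairwise_inputs(question: str, candidates: List[str]) -> List[str]:
    """Sentence-pair formulation: one input sequence per (question, candidate)."""
    return [f"{CLS} {question} {SEP} {c} {SEP}" for c in candidates]


def joint_input(question: str, candidates: List[str]) -> str:
    """Joint formulation: a single sequence holding the question and all candidates."""
    parts = [CLS, question, SEP]
    for c in candidates:
        parts.extend([c, SEP])
    return " ".join(parts)


# Illustrative data, not drawn from any dataset used in the paper.
question = "Who wrote Hamlet?"
candidates = [
    "Hamlet is a tragedy.",
    "It was written by William Shakespeare.",
    "The play is set in Denmark.",
]

pairs = pairwise_inputs(question, candidates)  # three independent sequences
joint = joint_input(question, candidates)      # one sequence with all candidates
```

In the pairwise case each candidate is scored without seeing the others, whereas in the joint case a model can, in principle, attend across all candidates at once, which is the setting the proposed pre-training objective targets.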

Code & Models

Repositories

amazon-research/wqa-multi-sentence-inference
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification