Paragraph-based Transformer Pre-training for Multi-Sentence Inference
Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti

TL;DR
This paper introduces a paragraph-based pre-training method for transformers that improves performance on multi-sentence inference tasks, such as answer sentence selection and fact verification, by modeling paragraph-level semantics.
Contribution
The paper proposes a novel pre-training objective that enhances transformer models for joint multi-sentence inference tasks, outperforming traditional pre-training methods.
Findings
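Popular pre-trained transformers perform poorly when fine-tuned directly on multi-candidate inference tasks.
The proposed paragraph-based pre-training outperforms traditional pre-training on three AS2 datasets and one fact verification dataset, both for joint multi-candidate models and for cross-encoders in the sentence-pair formulation.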
Code and models are publicly available for further research.
Abstract
Inference tasks such as answer sentence selection (AS2) or fact verification are typically solved by fine-tuning transformer-based models as individual sentence-pair classifiers. Recent studies show that these tasks benefit from modeling dependencies across multiple candidate sentences jointly. In this paper, we first show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks. We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences. Our evaluation on three AS2 and one fact verification datasets demonstrates the superiority of our pre-training technique over the traditional ones for transformers used as joint models for multi-candidate inference tasks, as well as when used as cross-encoders for sentence-pair formulations of these tasks. Our code and pre-trained…
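The abstract contrasts two ways of feeding candidates to a transformer: as individual sentence pairs, or jointly in a single input. The sketch below illustrates only that difference in input construction. The special tokens, function names, and example sentences are assumptions made for illustration; the paper's actual input format and its pre-training objective are not specified in this summary and are not reproduced here.

```python
from typing import List

# Placeholder special tokens (assumption; real tokenizers define their own).
CLS, SEP = "[CLS]", "[SEP]"


def pairwise_inputs(question: str, candidates: List[str]) -> List[str]:
    """Sentence-pair formulation: one sequence per (question, candidate) pair."""
    return [f"{CLS} {question} {SEP} {cand} {SEP}" for cand in candidates]


def joint_input(question: str, candidates: List[str]) -> str:
    """Joint formulation: the question and all candidates share one sequence."""
    parts = [CLS, question, SEP]
    for cand in candidates:
        parts += [cand, SEP]
    return " ".join(parts)


if __name__ == "__main__":
    q = "Who wrote Hamlet?"
    cands = [
        "Shakespeare wrote Hamlet.",
        "Hamlet is a tragedy.",
        "It is set in Denmark.",
    ]
    for seq in pairwise_inputs(q, cands):
        print(seq)
    print(joint_input(q, cands))
```

In the pairwise case the model scores each candidate in isolation, while in the joint case a single forward pass can attend across all candidates, which is the dependency the abstract says these tasks benefit from.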
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
