Relative Position Prediction as Pre-training for Text Encoders
Rickard Brüel-Gabrielsson, Chris Scarvelis

TL;DR
This paper proposes a position-centric pre-training method for text encoders using relative position prediction, aiming to improve downstream task performance by capturing token topology more effectively.
Contribution
It introduces a novel relative position prediction pre-training paradigm for NLP, emphasizing token topology over traditional token identity-based objectives.
Findings
Superior performance on downstream NLP tasks
Effective encoding of token topology
Advantage over traditional MLM and CLM objectives
Abstract
Meaning is defined by the company it keeps. However, company is two-fold: It's based on the identity of tokens and also on their position (topology). We argue that a position-centric perspective is more general and useful. The classic MLM and CLM objectives in NLP are easily phrased as position predictions over the whole vocabulary. Adapting the relative position encoding paradigm in NLP to create relative labels for self-supervised learning, we seek to show superior pre-training judged by performance on downstream tasks.
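To make the idea of "relative labels" concrete, here is a minimal sketch of how such labels could be built for one sequence. It is not taken from the paper: the function name `relative_position_labels`, the `max_distance` clipping parameter, and the choice to treat each clipped offset as a class label are all assumptions, borrowed from the way relative position encodings commonly bucket offsets.

```python
def relative_position_labels(seq_len, max_distance):
    """Build a seq_len x seq_len matrix of class labels for token pairs.

    Entry [i][j] encodes the signed offset j - i, clipped to
    [-max_distance, max_distance] and shifted so labels start at 0.
    Both the clipping and the shift are illustrative assumptions.
    """
    labels = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            # Clip the offset so distant pairs share one label per direction.
            offset = max(-max_distance, min(max_distance, j - i))
            row.append(offset + max_distance)
        labels.append(row)
    return labels


labels = relative_position_labels(seq_len=4, max_distance=2)
for row in labels:
    print(row)
```

Under these assumptions there are 2 * max_distance + 1 classes, and a pre-training head would be asked to predict the label for a pair of token representations without access to their absolute positions.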
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
