Relative Position Prediction as Pre-training for Text Encoders

Rickard Brüel-Gabrielsson; Chris Scarvelis

arXiv:2202.01145 · cs.CL · February 3, 2022

Open Access

TL;DR

This paper proposes a position-centric pre-training method for text encoders using relative position prediction, aiming to improve downstream task performance by capturing token topology more effectively.

Contribution

It introduces a novel relative position prediction pre-training paradigm for NLP, emphasizing token topology over traditional token identity-based objectives.

Findings

01 Superior performance on downstream NLP tasks

02 Effective encoding of token topology

03 Advantage over traditional MLM and CLM objectives

Abstract

Meaning is defined by the company it keeps. However, company is two-fold: It's based on the identity of tokens and also on their position (topology). We argue that a position-centric perspective is more general and useful. The classic MLM and CLM objectives in NLP are easily phrased as position predictions over the whole vocabulary. Adapting the relative position encoding paradigm in NLP to create relative labels for self-supervised learning, we seek to show superior pre-training judged by performance on downstream tasks.
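
The abstract does not spell out how the relative labels are built, so the following is only a minimal sketch of one plausible construction, not the authors' method. It assumes that the label for an ordered token pair (i, j) is the signed offset j - i, clipped to a maximum distance as is common in relative position encodings, and shifted so it can serve as a class index. The function name `relative_position_labels` and the parameter `max_distance` are hypothetical.

```python
from typing import List


def relative_position_labels(seq_len: int, max_distance: int) -> List[List[int]]:
    """Build a seq_len x seq_len matrix of class labels for token pairs.

    Assumption (not taken from the paper): the label for pair (i, j) is the
    signed offset j - i, clipped to [-max_distance, max_distance] and shifted
    by max_distance so that labels lie in 0 .. 2 * max_distance.
    """
    labels = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            # Clip the signed offset so distant pairs share a single class.
            offset = max(-max_distance, min(max_distance, j - i))
            # Shift to a non-negative class index.
            row.append(offset + max_distance)
        labels.append(row)
    return labels


labels = relative_position_labels(seq_len=4, max_distance=2)
for row in labels:
    print(row)
# [2, 3, 4, 4]
# [1, 2, 3, 4]
# [0, 1, 2, 3]
# [0, 0, 1, 2]
```

Under these assumptions, a self-supervised objective could train a classifier over the 2 * max_distance + 1 classes from pairs of token representations, with the positional inputs hidden or perturbed so the labels are not trivially readable.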

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling