RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models

Shitao Xiao; Zheng Liu; Yingxia Shao; Zhao Cao

arXiv:2305.02564 · cs.CL · May 5, 2023 · 1 citation

TL;DR

RetroMAE-2 introduces a duplex auto-encoding pre-training method that enhances semantic representations by jointly leveraging all token embeddings, significantly improving retrieval performance on benchmarks like MS MARCO and BEIR.

Contribution

The paper proposes the Duplex Masked Auto-Encoder (DupMAE), a pre-training approach that trains jointly on two tasks so that all contextualized embeddings can be used for retrieval.

Findings

01 Improves semantic representation quality for retrieval models.

02 Achieves superior performance on MS MARCO and BEIR benchmarks.

03 Enhances transferability of pre-trained language models.

Abstract

To better support information retrieval tasks such as web search and open-domain question answering, growing effort has been made to develop retrieval-oriented language models, e.g., RetroMAE and many others. Most existing works focus on improving the semantic representation capability of the contextualized embedding of the [CLS] token. However, recent studies show that the ordinary tokens besides [CLS] may provide extra information, which helps to produce a better representation. As such, it is necessary to extend the current methods so that all contextualized embeddings can be jointly pre-trained for retrieval tasks. In this work, we propose a novel pre-training method called Duplex Masked Auto-Encoder, a.k.a. DupMAE. It is designed to improve the quality of semantic representation so that all contextualized embeddings of the pre-trained model can be leveraged. It takes…
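
To make the idea of using ordinary token embeddings alongside [CLS] concrete, here is a minimal sketch in plain Python. It is not the DupMAE procedure, which the truncated abstract above does not spell out. The helper names (`max_pool`, `joint_representation`, `dot`), the max-pooling step, the concatenation, and the toy vectors are all assumptions chosen for illustration only.

```python
from typing import List

Vector = List[float]


def max_pool(vectors: List[Vector]) -> Vector:
    """Element-wise maximum over a non-empty list of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def joint_representation(token_embeddings: List[Vector]) -> Vector:
    """Concatenate the [CLS] embedding with a pooled summary of the other tokens.

    Assumptions (hypothetical, not taken from the paper): position 0 holds
    [CLS], and max-pooling plus concatenation is the aggregation scheme.
    """
    if len(token_embeddings) < 2:
        raise ValueError("need [CLS] plus at least one ordinary token")
    cls_embedding = token_embeddings[0]
    ordinary_tokens = token_embeddings[1:]
    return cls_embedding + max_pool(ordinary_tokens)


def dot(a: Vector, b: Vector) -> float:
    """Inner-product relevance score between two representations."""
    return sum(x * y for x, y in zip(a, b))


# Toy 2-dimensional embeddings: first row is [CLS], the rest are ordinary tokens.
query = joint_representation([[1.0, 0.0], [0.5, 0.2], [0.1, 0.9]])
doc = joint_representation([[0.8, 0.1], [0.4, 0.3], [0.2, 0.7]])
score = dot(query, doc)
```

In this sketch the relevance score draws on both the [CLS] vector and the ordinary tokens, which is the general motivation the abstract gives; a [CLS]-only baseline would keep just the first two dimensions of each representation.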

Code & Models

Repositories

staoxiao/retromae (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques