CoT-MoTE: Exploring ConTextual Masked Auto-Encoder Pre-training with Mixture-of-Textual-Experts for Passage Retrieval

Guangyuan Ma; Xing Wu; Peng Wang; Songlin Hu

arXiv:2304.10195 · cs.CL · April 21, 2023 · 1 citation

Open Access

TL;DR

This paper introduces CoT-MoTE, a novel pre-training method for passage retrieval that uses mixture-of-experts to better encode queries and passages, resulting in improved retrieval accuracy and balanced embeddings.

Contribution

It proposes a new pre-training approach combining textual-specific experts with shared attention for dual-encoder passage retrieval models.

Findings

01. Improved retrieval performance on large-scale benchmarks.

02. More balanced discrimination of embedding spaces.

03. Effective encoding of query and passage properties.

Abstract

Passage retrieval aims to retrieve relevant passages from large collections of the open-domain corpus. Contextual Masked Auto-Encoding has been proven effective in representation bottleneck pre-training of a monolithic dual-encoder for passage retrieval. Siamese or fully separated dual-encoders are often adopted as basic retrieval architecture in the pre-training and fine-tuning stages for encoding queries and passages into their latent embedding spaces. However, simply sharing or separating the parameters of the dual-encoder results in an imbalanced discrimination of the embedding spaces. In this work, we propose to pre-train Contextual Masked Auto-Encoder with Mixture-of-Textual-Experts (CoT-MoTE). Specifically, we incorporate textual-specific experts for individually encoding the distinct properties of queries and passages. Meanwhile, a shared self-attention layer is still kept for…
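
The layer structure described in the abstract (a shared self-attention layer plus textual-specific experts for queries and passages) can be illustrated with a minimal sketch. This is not the authors' implementation: placing the experts at the feed-forward sub-layer, the class name `MoTELayerSketch`, the `"query"`/`"passage"` keys, the single attention head, the random weights, and the omission of layer normalization are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def add(a, b):
    """Element-wise sum of two equally shaped matrices (residual connection)."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]


def rand_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MoTELayerSketch:
    """Hypothetical layer: shared self-attention, then an expert chosen by text type."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # Attention projections are shared by queries and passages.
        self.wq = rand_matrix(dim, dim, rng)
        self.wk = rand_matrix(dim, dim, rng)
        self.wv = rand_matrix(dim, dim, rng)
        # One feed-forward expert per text type (assumed placement of the experts).
        self.experts = {
            kind: (rand_matrix(dim, hidden, rng), rand_matrix(hidden, dim, rng))
            for kind in ("query", "passage")
        }

    def attention(self, x):
        """Single-head scaled dot-product self-attention over token rows."""
        q, k, v = matmul(x, self.wq), matmul(x, self.wk), matmul(x, self.wv)
        k_t = [list(col) for col in zip(*k)]
        scale = math.sqrt(self.dim)
        weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
        return matmul(weights, v)

    def forward(self, x, kind):
        """Run shared attention, then the feed-forward expert for `kind`."""
        h = add(x, self.attention(x))
        w_in, w_out = self.experts[kind]
        hidden = [[max(0.0, v) for v in row] for row in matmul(h, w_in)]  # ReLU
        return add(h, matmul(hidden, w_out))
```

In this sketch the same token matrix yields identical attention output for either text type, and only the feed-forward expert differs: `MoTELayerSketch(4, 8).forward(tokens, "query")` and `.forward(tokens, "passage")` return matrices of the same shape computed with different expert weights.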

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications