Adversarial Retriever-Ranker for dense text retrieval

Hang Zhang; Yeyun Gong; Yelong Shen; Jiancheng Lv; Nan Duan; Weizhu Chen

arXiv:2110.03611 · cs.CL · November 1, 2022 · 37 cites

TL;DR

The paper introduces AR2, an adversarial training framework combining a retriever and ranker to enhance dense text retrieval, outperforming existing methods on multiple benchmarks.

Contribution

It proposes a novel adversarial training approach with a dual-encoder retriever and cross-encoder ranker for improved dense retrieval performance.

Findings

01. AR2 achieves state-of-the-art results on three benchmarks.

02. It significantly improves recall and ranking metrics.

03. The adversarial training leads to harder negatives and better models.

Abstract

Current dense text retrieval models face two typical challenges. First, they adopt a siamese dual-encoder architecture to encode queries and documents independently for fast indexing and searching, while neglecting the finer-grained term-wise interactions. This results in a sub-optimal recall performance. Second, their model training highly relies on a negative sampling technique to build up the negative documents in their contrastive losses. To address these challenges, we present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker. The two models are jointly optimized according to a minimax adversarial objective: the retriever learns to retrieve negative documents to cheat the ranker, while the ranker learns to rank a collection of candidates including both the ground-truth and the retrieved ones, as well as providing progressive…
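The two challenges named in the abstract can be illustrated with a minimal sketch. It is not the AR2 training procedure: the cross-encoder ranker and the minimax objective are not modeled here. It only shows, under stated assumptions, how a dual-encoder scores a query and documents that were embedded independently, and how the choice of negatives changes a softmax-style contrastive loss. The hand-written vectors stand in for encoder outputs, and all function names (`retrieve_top_k`, `hard_negatives`, `contrastive_loss`) are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(query_vec, doc_vecs, k):
    """Dual-encoder scoring: the query and each document are embedded
    independently, so relevance is a single dot product per document."""
    scored = sorted(((dot(query_vec, d), i) for i, d in enumerate(doc_vecs)),
                    reverse=True)
    return [i for _, i in scored[:k]]


def hard_negatives(query_vec, doc_vecs, positive_idx, k):
    """Use the retriever's own top-ranked non-positive documents as negatives."""
    ranked = retrieve_top_k(query_vec, doc_vecs, len(doc_vecs))
    return [i for i in ranked if i != positive_idx][:k]


def contrastive_loss(pos_score, neg_scores):
    """Negative log-softmax of the positive among positive + negatives."""
    scores = [pos_score] + list(neg_scores)
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - pos_score


# Toy embeddings (assumed, for illustration only).
query = [1.0, 0.0]
docs = [[0.9, 0.1], [0.8, 0.3], [-1.0, 0.0], [0.0, 1.0]]
positive = 0

negs = hard_negatives(query, docs, positive, k=2)
pos_score = dot(query, docs[positive])
loss_hard = contrastive_loss(pos_score, [dot(query, docs[i]) for i in negs])
loss_easy = contrastive_loss(pos_score, [dot(query, docs[2])])
print(negs, loss_hard > loss_easy)
```

In this toy setup, negatives that score close to the positive yield a larger loss than a clearly irrelevant one, which is why the choice of negative sampling matters for training.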

Code & Models

Repositories

microsoft/ar2 · PyTorch · Official

Videos

Adversarial Retriever-Ranker for Dense Text Retrieval · SlidesLive

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications