YES SIR! Optimizing Semantic Space of Negatives with Self-Involvement Ranker

Ruizhi Pu; Xinyu Zhang; Ruofei Lai; Zikai Guo; Yinxia Zhang; Hao Jiang; Yongkang Wu; Yantao Jia; Zhicheng Dou; Zhao Cao

arXiv:2109.06436·cs.IR·September 15, 2021

TL;DR

This paper introduces Self-Involvement Ranker (SIR), a novel fine-tuning strategy for document ranking that dynamically selects hard negative samples to improve semantic space quality and ranking performance of pre-trained models.

Contribution

The paper proposes SIR, a lightweight, general framework that adaptively selects hard negatives using supervisory signals, achieving state-of-the-art results on MS MARCO document ranking.

Findings

01. SIR significantly improves ranking performance across pre-trained models.

02. SIR sets a new state of the art on the MS MARCO document ranking leaderboard.

03. Dynamic negative sampling enhances semantic space quality.

Abstract

Pre-trained models such as BERT have proven effective for Information Retrieval (IR) problems. Owing to their strong performance, they are widely used to tackle real-world IR problems such as document ranking. Recently, researchers have found that selecting "hard" rather than "random" negative samples is beneficial when fine-tuning pre-trained models on ranking tasks. However, how to leverage hard negative samples in a principled way remains unclear. To address this issue, we propose a fine-tuning strategy for document ranking, namely Self-Involvement Ranker (SIR), which dynamically selects hard negative samples to construct a high-quality semantic space for training a high-quality ranking model. Specifically, SIR consists of sequential compressors implemented with pre-trained models. The front compressor selects hard negative samples…
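
To make the idea of "hard" versus "random" negatives concrete, here is a minimal sketch of one common way to pick hard negatives: take the non-relevant candidates that a front-stage ranker scores highest. This is an illustration under stated assumptions, not the SIR procedure itself, which is only partially described in the truncated abstract. The function name `select_hard_negatives`, the document ids, and the scores are all hypothetical.

```python
from typing import Dict, List, Set


def select_hard_negatives(
    scores: Dict[str, float],
    relevant_ids: Set[str],
    k: int,
) -> List[str]:
    """Return up to k non-relevant document ids with the highest scores.

    Assumption: a higher score means the front-stage ranker considers the
    document more relevant, so high-scoring non-relevant documents are the
    "hard" negatives.
    """
    # Keep only candidates that are not labeled relevant for this query.
    negatives = [doc_id for doc_id in scores if doc_id not in relevant_ids]
    # Sort by score descending; break ties by doc id for determinism.
    negatives.sort(key=lambda doc_id: (-scores[doc_id], doc_id))
    return negatives[:k]


# Hypothetical front-stage ranker scores for a single query.
front_scores = {"d1": 0.92, "d2": 0.15, "d3": 0.81, "d4": 0.40, "d5": 0.77}
hard_negatives = select_hard_negatives(front_scores, relevant_ids={"d1"}, k=2)
print(hard_negatives)  # ['d3', 'd5']
```

A random sampler would be as likely to return `d2` as `d3`; selecting by score instead surfaces the negatives the current model finds most confusable with the relevant document, which is the property the abstract attributes to hard negatives.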

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Weight Decay · Attention Dropout · Dropout · Layer Normalization · Softmax · Residual Connection