How Different are Pre-trained Transformers for Text Ranking?

David Rau; Jaap Kamps

arXiv:2204.07233 · cs.IR · April 18, 2022 · 1 citation

TL;DR

This paper compares BERT-based cross-encoders with traditional BM25 ranking on passage retrieval, examining where the two rankers agree and finding substantial differences in their notions of relevance.

Contribution

It provides a detailed analysis of how BERT-based cross-encoders differ from traditional BM25 ranking in the MS Marco/TREC Deep Learning Track passage retrieval setup.

Findings

01

Neural rankers often prioritize different relevance signals than traditional models.

02

BERT models can retrieve documents missed by traditional systems, enhancing recall.

03

Substantial differences in relevance understanding suggest avenues for improving neural ranking models.

Abstract

In recent years, large pre-trained transformers have led to substantial gains in performance over traditional retrieval models and feedback approaches. However, these results are primarily based on the MS Marco/TREC Deep Learning Track setup, with its very particular setup, and our understanding of why and how these models work better is fragmented at best. We analyze effective BERT-based cross-encoders versus traditional BM25 ranking for the passage retrieval task where the largest gains have been observed, and investigate two main questions. On the one hand, what is similar? To what extent does the neural ranker already encompass the capacity of traditional rankers? Is the gain in performance due to a better ranking of the same documents (prioritizing precision)? On the other hand, what is different? Can it retrieve effectively documents missed by traditional systems (prioritizing…
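The two questions in the abstract (what is similar, what is different) can be illustrated with a small comparison of two rankings for the same query. The sketch below is not the authors' analysis code. The function names, the toy document IDs, and the bucket edges are all hypothetical, and it assumes the neural ranker re-orders the same candidate list that BM25 produced.

```python
from typing import Dict, List


def top_k_overlap(ranking_a: List[str], ranking_b: List[str], k: int) -> float:
    """Fraction of the top-k documents of ranking_a that are also in the top-k of ranking_b."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = ranking_a[:k]
    top_b = set(ranking_b[:k])
    if not top_a:
        return 0.0
    return sum(1 for doc in top_a if doc in top_b) / len(top_a)


def first_stage_rank_buckets(
    reranked: List[str], first_stage: List[str], k: int, edges: List[int]
) -> Dict[int, int]:
    """Count the top-k reranked documents by the bucket of their first-stage rank.

    `edges` are inclusive upper bounds of the rank buckets in ascending order,
    e.g. [10, 100, 1000]. Documents that the first stage did not return, or
    ranked below the last edge, are skipped.
    """
    rank_of = {doc: i + 1 for i, doc in enumerate(first_stage)}
    counts = {edge: 0 for edge in edges}
    for doc in reranked[:k]:
        rank = rank_of.get(doc)
        if rank is None:
            continue
        for edge in edges:
            if rank <= edge:
                counts[edge] += 1
                break
    return counts


# Toy data (hypothetical document IDs, not taken from MS Marco).
BM25_RANKING = ["d1", "d2", "d3", "d4", "d5", "d6"]
NEURAL_RANKING = ["d5", "d1", "d6", "d2", "d3", "d4"]

OVERLAP_AT_3 = top_k_overlap(NEURAL_RANKING, BM25_RANKING, k=3)
BUCKETS_AT_3 = first_stage_rank_buckets(NEURAL_RANKING, BM25_RANKING, k=3, edges=[3, 6])

print(f"overlap@3 = {OVERLAP_AT_3:.2f}")
print(f"first-stage rank buckets of neural top-3 = {BUCKETS_AT_3}")
```

A high overlap would suggest the neural ranker mostly re-orders documents BM25 already ranked highly. Many top documents coming from deep first-stage buckets would suggest it surfaces documents BM25 ranked low.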

Code & Models

Repositories

davidmrau/transformer-vs-bm25
Official

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Linear Warmup With Linear Decay · Dense Connections · Attention Dropout · Layer Normalization · Softmax