Not All Relevance Scores are Equal: Efficient Uncertainty and Calibration Modeling for Deep Retrieval Models

Daniel Cohen; Bhaskar Mitra; Oleg Lesota; Navid Rekabsaz; Carsten Eickhoff

arXiv:2105.04651·cs.IR·May 12, 2021

TL;DR

This paper introduces an efficient Bayesian approach to quantify uncertainty in deep retrieval models' relevance scores, enhancing ranking effectiveness and calibration for downstream tasks.

Contribution

It proposes a novel, computationally efficient Bayesian framework to model uncertainty in retrieval scores, improving calibration and downstream task performance.

Findings

01. Significantly improves ranking effectiveness via risk-aware reranking.
02. Enhances confidence calibration of retrieval models.
03. Uncertainty information is reliable and actionable for downstream tasks.
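
To make the idea of risk-aware reranking concrete, the following is a minimal sketch, not the paper's confirmed scoring rule. It assumes each document already has several sampled relevance scores and ranks by mean minus a multiple of the standard deviation, a common risk-averse formulation. The function name `risk_aware_rerank`, the parameter `alpha`, and the toy scores are all hypothetical.

```python
from statistics import mean, pstdev


def risk_aware_rerank(score_samples, alpha=1.0):
    """Rank documents by mean score penalized by score uncertainty.

    score_samples: dict mapping doc_id -> list of sampled relevance scores.
    alpha: hypothetical risk-aversion weight; 0.0 falls back to ranking by mean.
    """
    adjusted = {}
    for doc_id, samples in score_samples.items():
        mu = mean(samples)        # expected relevance
        sigma = pstdev(samples)   # spread of the sampled scores
        adjusted[doc_id] = mu - alpha * sigma
    # Highest adjusted score first.
    return sorted(adjusted, key=adjusted.get, reverse=True)


# Toy data (made up): doc_a has a higher mean but much higher variance.
samples = {
    "doc_a": [0.95, 0.35, 0.95, 0.35],
    "doc_b": [0.58, 0.60, 0.62, 0.60],
}
ranking_by_mean = risk_aware_rerank(samples, alpha=0.0)
ranking_risk_averse = risk_aware_rerank(samples, alpha=1.0)
print(ranking_by_mean, ranking_risk_averse)
```

With `alpha=0.0` the ranking depends only on the mean score. Increasing `alpha` demotes documents whose scores the model is less certain about.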

Abstract

In any ranking system, the retrieval model outputs a single score for a document based on its belief about how relevant it is to a given search query. While retrieval models have continued to improve with the introduction of increasingly complex architectures, few works have investigated a retrieval model's belief in the score beyond the scope of a single value. We argue that capturing the model's uncertainty with respect to its own scoring of a document is a critical aspect of retrieval that allows for greater use of current models across new document distributions and collections, and can even improve effectiveness for downstream tasks. In this paper, we address this problem via an efficient Bayesian framework for retrieval models which captures the model's belief in the relevance score through a stochastic process while adding only negligible computational overhead. We evaluate this belief…
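
One way a score distribution could be obtained cheaply is sketched below. This is an assumption and not a description of the authors' implementation: it applies Monte Carlo dropout only to a final linear scoring layer, so the expensive encoding is computed once and only the last step is resampled. The function `sample_scores`, the feature and weight values, and the dropout rate are hypothetical.

```python
import random
from statistics import mean, pstdev


def sample_scores(features, weights, bias, n_samples=50, drop_prob=0.2, seed=0):
    """Draw relevance scores by resampling dropout masks on one linear layer.

    features: output of the (assumed, already computed) encoder for a
              query-document pair; it is reused across all samples.
    weights, bias: parameters of a hypothetical final scoring layer.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    keep = 1.0 - drop_prob
    scores = []
    for _ in range(n_samples):
        total = bias
        for x, w in zip(features, weights):
            # Keep each feature with probability `keep`; rescale kept
            # features (inverted dropout) so the expected score is unchanged.
            if rng.random() < keep:
                total += (x / keep) * w
        scores.append(total)
    return scores


# Toy values (made up) standing in for an encoder output and learned weights.
features = [0.4, 1.2, -0.3, 0.8]
weights = [0.5, -0.1, 0.7, 0.3]
scores = sample_scores(features, weights, bias=0.1)
print(mean(scores), pstdev(scores))
```

The mean of the samples serves as the relevance estimate and the standard deviation as an uncertainty estimate. Because only the last layer is resampled, the added cost per sample is small compared with rerunning the full model.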

Code & Models

Repositories

dscohen/LastLayersBayesianIR
pytorch

Taxonomy

Methods: Attentive Walk-Aggregating Graph Neural Network