Mitigating the Position Bias of Transformer Models in Passage Re-Ranking

Sebastian Hofstätter; Aldo Lipani; Sophia Althammer; Markus Zlabinger; Allan Hanbury

arXiv:2101.06980·cs.IR·January 19, 2021

TL;DR

This paper identifies and addresses position bias in passage re-ranking datasets, proposing a debiasing method that improves the robustness and transferability of Transformer-based models across biased and unbiased datasets.

Contribution

It introduces a novel debiasing technique for passage re-ranking datasets that reduces position bias and enhances model generalization and transfer learning capabilities.

Findings

01

Debiasing improves model performance on unbiased datasets

02

Mitigating position bias enhances transfer learning between datasets

03

Models trained on debiased data perform consistently across biased and unbiased datasets

Abstract

Supervised machine learning models and their evaluation strongly depend on the quality of the underlying dataset. When we search for a relevant piece of information, it may appear anywhere in a given passage. However, we observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-ranking. The excessive favoring of earlier positions inside passages is an unwanted artefact. This leads three common Transformer-based re-ranking models to ignore relevant parts in unseen passages. More concerningly, as the evaluation set is taken from the same biased distribution, the models overfitting to that bias overestimate their true effectiveness. In this work we analyze position bias on datasets, the contextualized representations, and their effect on retrieval results. We propose a debiasing method for retrieval datasets. Our…
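The kind of position bias the abstract describes can be illustrated with a small measurement sketch. This is not the authors' analysis code: the function names `relative_answer_position` and `position_histogram`, the exact-substring matching, and the choice of four bins are all assumptions made for illustration. It locates each answer string inside its passage and counts how often the answer starts in each quarter of the passage.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def relative_answer_position(passage: str, answer: str) -> Optional[float]:
    """Return the answer's start offset as a fraction of passage length.

    Hypothetical helper: uses case-insensitive exact substring matching,
    and returns None when the answer is not found or the passage is empty.
    """
    start = passage.lower().find(answer.lower())
    if start < 0 or not passage:
        return None
    return start / len(passage)


def position_histogram(pairs: Iterable[Tuple[str, str]], bins: int = 4) -> List[int]:
    """Count answer start positions per equal-width bin of the passage.

    Pairs whose answer cannot be located are skipped.
    """
    counts: Counter = Counter()
    for passage, answer in pairs:
        pos = relative_answer_position(passage, answer)
        if pos is None:
            continue
        # Clamp to the last bin so a position of exactly 1.0 cannot overflow.
        counts[min(int(pos * bins), bins - 1)] += 1
    return [counts[b] for b in range(bins)]


if __name__ == "__main__":
    toy_pairs = [
        ("Paris is the capital of France.", "Paris"),
        ("The capital of France is Paris.", "Paris"),
        ("abc", "zzz"),  # answer not present, skipped
    ]
    print(position_histogram(toy_pairs))
```

On a dataset that favors early positions, most of the mass in such a histogram would fall in the first bins; a roughly flat histogram would suggest little position bias under this simple matching assumption.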

Code & Models

Repositories

sebastian-hofstaetter/transformer-kernel-ranking (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning