Adaptive Distributional Extensions to DFR Ranking
Casper Petersen, Jakob Grue Simonsen, Kalervo Jarvelin, and Christina Lioma

TL;DR
This paper introduces Adaptive Distributional Ranking (ADR), an extension to Divergence From Randomness (DFR) ranking that fits the distribution of non-informative terms to each dataset rather than assuming one in advance, improving ranking effectiveness.
Contribution
The paper proposes a method to automatically identify the best-fitting distribution for non-informative terms and adapt the DFR ranking accordingly, enhancing its empirical performance.
Findings
ADR outperforms traditional DFR models on TREC datasets.
ADR achieves performance comparable to query likelihood language models.
Adaptive fitting improves the discrimination between informative and non-informative terms.
Abstract
Divergence From Randomness (DFR) ranking models assume that informative terms are distributed in a corpus differently than non-informative terms. Different statistical models (e.g. Poisson, geometric) are used to model the distribution of non-informative terms, producing different DFR models. An informative term is then detected by measuring the divergence of its distribution from the distribution of non-informative terms. However, there is little empirical evidence that the distributions of non-informative terms used in DFR actually fit current datasets. Practically this risks providing a poor separation between informative and non-informative terms, thus compromising the discriminative power of the ranking model. We present a novel extension to DFR, which first detects the best-fitting distribution of non-informative terms in a collection, and then adapts the ranking computation to…
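The two steps the abstract describes, picking the distribution that best fits non-informative terms and then measuring how far a term diverges from it, can be illustrated with a small sketch. This is not the authors' procedure: the candidate set (Poisson and geometric only), the use of maximum log-likelihood as the selection criterion, and all function names below are assumptions made for illustration. The score is the basic DFR information content, -log2 P(tf), with no length normalization or gain component.

```python
import math

def poisson_log_pmf(k, mean):
    """Natural-log Poisson probability of count k, with rate set to the sample mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def geometric_log_pmf(k, mean):
    """Natural-log geometric probability on {0, 1, 2, ...}, with p = 1 / (1 + mean)."""
    p = 1.0 / (1.0 + mean)
    return math.log(p) + k * math.log(1.0 - p)

# Hypothetical candidate set; each has one parameter, so comparing
# raw log-likelihoods does not favor either model by parameter count.
CANDIDATES = {"poisson": poisson_log_pmf, "geometric": geometric_log_pmf}

def best_fitting_model(counts):
    """Return (name, mean, scores) for the candidate with the highest log-likelihood.

    `counts` holds per-document frequencies of terms assumed to be non-informative.
    """
    mean = sum(counts) / len(counts)
    if mean <= 0:
        raise ValueError("need at least one non-zero count to fit a model")
    scores = {
        name: sum(log_pmf(k, mean) for k in counts)
        for name, log_pmf in CANDIDATES.items()
    }
    best = max(scores, key=scores.get)
    return best, mean, scores

def information_content(tf, model, mean):
    """-log2 P(tf) under the chosen model: larger means more divergent from 'random'."""
    return -CANDIDATES[model](tf, mean) / math.log(2)

# Tightly clustered counts favor the Poisson; a long tail favors the geometric.
clustered = [2, 3, 2, 3, 2, 3, 2, 3]
long_tailed = [0, 0, 0, 0, 0, 0, 1, 0, 9, 0]
model_a, mean_a, _ = best_fitting_model(clustered)
model_b, mean_b, _ = best_fitting_model(long_tailed)
```

Under these assumptions, a term whose frequency in a document is improbable under the selected model receives a high information content and is treated as informative. A fuller treatment would compare more candidate distributions and use a proper goodness-of-fit test.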
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
