Local Term Weight Models from Power Transformations: Development of BM25IR: A Best Match Model based on Inverse Regression

Edel Garcia

arXiv:1608.01573·cs.IR·August 5, 2016

TL;DR

This paper introduces BM25IR, a local term weighting model derived from power transformations and inverse regression. Simulations suggest it works fairly well across different BM25 parametric conditions and document lengths.

Contribution

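It uses power transformations as a common framework for deriving local term weights, shows that BM25 and inverse regression produce equivalent results under some parametric conditions, and develops BM25IR, a term weight model based on inverse regression.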

Findings

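1. Simulations suggest that BM25IR works fairly well for different BM25 parametric conditions and document lengths.

2. As a special case of inverse regression, the largest increment in term weight occurs when a term is mentioned for the second time.

3. Power transformations provide a common framework for deriving local term weights.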

Abstract

In this article we show how power transformations can be used as a common framework for the derivation of local term weights. We found that under some parametric conditions, BM25 and inverse regression produce equivalent results. As a special case of inverse regression, we show that the largest increment in term weight occurs when a term is mentioned for the second time. A model based on inverse regression (BM25IR) is presented. Simulations suggest that BM25IR works fairly well for different BM25 parametric conditions and document lengths.
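The saturation behaviour behind the "second mention" remark can be illustrated with the standard BM25 term-frequency component. The sketch below is not the paper's BM25IR formula, which this page does not reproduce. The function name `bm25_local_weight` and the default values for `k1`, `b`, `dl`, and `avgdl` are assumptions chosen for illustration only.

```python
def bm25_local_weight(tf, k1=1.2, b=0.75, dl=100.0, avgdl=100.0):
    """Standard BM25 term-frequency component (illustrative defaults).

    tf    : number of times the term occurs in the document
    k1, b : the usual BM25 parameters (assumed values, not from the paper)
    dl    : document length; avgdl : average document length
    """
    # Length-normalised saturation constant.
    K = k1 * (1.0 - b + b * dl / avgdl)
    return tf * (k1 + 1.0) / (tf + K)


# Weights for a term occurring 1..5 times in a document of average length.
weights = [bm25_local_weight(tf) for tf in range(1, 6)]

# Gain in weight from each additional mention (1->2, 2->3, ...).
increments = [later - earlier for earlier, later in zip(weights, weights[1:])]

for tf, gain in enumerate(increments, start=2):
    print(f"mention {tf}: weight gain {gain:.4f}")
```

Because this weight is a concave function of the term frequency, each additional mention adds less than the one before. Among terms already present in a document, the step from the first to the second mention is therefore the largest, which is consistent with the abstract's observation under these assumed parameters.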

Taxonomy

Topics: Text and Document Classification Technologies