Local Term Weight Models from Power Transformations: Development of BM25IR: A Best Match Model based on Inverse Regression
Edel Garcia

TL;DR
This paper introduces BM25IR, a local term weighting model derived from power transformations and inverse regression; simulations suggest it works fairly well across different BM25 parametric conditions and document lengths.
Contribution
The paper develops BM25IR, a term weight model based on inverse regression, and uses power transformations as a common framework that relates it to existing models such as BM25.
Findings
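In simulations, BM25IR works fairly well across different BM25 parametric conditions and document lengths.
As a special case of inverse regression, the largest increment in term weight occurs when a term is mentioned for the second time.
Power transformations provide a common framework for deriving local term weights.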
Abstract
In this article we show how power transformations can be used as a common framework for the derivation of local term weights. We found that under some parametric conditions, BM25 and inverse regression produce equivalent results. As a special case of inverse regression, we show that the largest increment in term weight occurs when a term is mentioned for the second time. A model based on inverse regression (BM25IR) is presented. Simulations suggest that BM25IR works fairly well for different BM25 parametric conditions and document lengths.
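For orientation, here is a minimal sketch of the two ingredients the abstract refers to. It is not the paper's BM25IR model, whose formula is not given on this page. It assumes the textbook BM25 term-frequency component (without the IDF factor) and the Box-Cox family as one common example of a power transformation. The function names and the defaults `k1=1.2`, `b=0.75` are illustrative assumptions.

```python
import math


def bm25_local_weight(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Textbook BM25 term-frequency component, without the IDF factor.

    Assumption: this is the standard saturating form, not BM25IR.
    """
    # Length normalization: longer-than-average documents are penalized.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + norm)


def box_cox(x, lam):
    """Box-Cox power transformation for x > 0 (one example of a power transform)."""
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


# Local weights for term frequencies 0..5 in an average-length document.
weights = [bm25_local_weight(tf, 100, 100) for tf in range(6)]

# Gain in weight contributed by each additional occurrence of the term.
increments = [later - earlier for earlier, later in zip(weights, weights[1:])]

# With lam = -1 the Box-Cox transform reduces to the inverse form 1 - 1/x.
inverse_form = [box_cox(x, -1) for x in (1, 2, 3)]
```

In this plain BM25 sketch the gain from each additional occurrence shrinks as the term frequency grows, and the weight stays below `k1 + 1`. The abstract's result about the second mention concerns the inverse regression special case, which this sketch does not reproduce.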
Taxonomy
Topics: Text and Document Classification Technologies
