A Few Brief Notes on DeepImpact, COIL, and a Conceptual Framework for Information Retrieval Techniques
Jimmy Lin, Xueguang Ma

TL;DR
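This paper introduces a conceptual framework that organizes information retrieval techniques by representation type (sparse vs. dense, unsupervised vs. learned) and uses it to compare recent methods. It also proposes uniCOIL, a simple extension of COIL that, to the authors' knowledge, achieves state-of-the-art sparse retrieval results on the MS MARCO passage ranking dataset.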
Contribution
It presents a unifying framework for IR techniques, uses it to relate existing methods to one another, and introduces uniCOIL, a simple extension of COIL that improves sparse retrieval performance.
Findings
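uniCOIL achieves, to the authors' knowledge, state-of-the-art sparse retrieval results on the MS MARCO passage ranking dataset.
The framework clarifies the relationships between recently proposed techniques such as DPR, ANCE, DeepCT, DeepImpact, and COIL.
Gaps revealed by the analysis point to techniques that have yet to be explored.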
Abstract
Recent developments in representational learning for information retrieval can be organized in a conceptual framework that establishes two pairs of contrasts: sparse vs. dense representations and unsupervised vs. learned representations. Sparse learned representations can further be decomposed into expansion and term weighting components. This framework allows us to understand the relationship between recently proposed techniques such as DPR, ANCE, DeepCT, DeepImpact, and COIL, and furthermore, gaps revealed by our analysis point to "low hanging fruit" in terms of techniques that have yet to be explored. We present a novel technique dubbed "uniCOIL", a simple extension of COIL that achieves to our knowledge the current state-of-the-art in sparse retrieval on the popular MS MARCO passage ranking dataset. Our implementation using the Anserini IR toolkit is built on the Lucene search…
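To make the idea of a sparse representation with per-term weights concrete, here is a minimal sketch in Python. It is not the authors' implementation and does not use Anserini or Lucene. It assumes each document and each query is already a mapping from terms to scalar weights, and it scores a document as the sum of products of matching term weights, looked up through an inverted index. The function names (`build_inverted_index`, `rank`) and all weights are hypothetical and hand-picked for illustration. In a learned method the weights would come from a trained model, and an expansion component could add terms that do not appear in the original text.

```python
from collections import defaultdict


def build_inverted_index(doc_weights):
    """Map each term to a postings list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, weights in doc_weights.items():
        for term, weight in weights.items():
            index[term].append((doc_id, weight))
    return index


def rank(query_weights, index):
    """Score documents by summing query_weight * doc_weight over shared terms."""
    scores = defaultdict(float)
    for term, q_weight in query_weights.items():
        # .get avoids creating empty postings for unseen query terms
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    # Highest score first; ties broken by document id for determinism
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy, hand-picked weights (assumption: a real system would learn these).
docs = {
    "d1": {"sparse": 2, "retrieval": 3, "lucene": 1},
    "d2": {"dense": 4, "retrieval": 1},
}
query = {"sparse": 1, "retrieval": 2}

index = build_inverted_index(docs)
ranking = rank(query, index)
print(ranking)
```

Because only terms with non-zero weight are stored, the same inverted-index machinery used for unsupervised sparse weighting can serve learned weights; only the source of the numbers changes.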
Code & Models
- 🤗 BAAI/bge-m3 (model) · 14.5M dl · ♡ 2871
- 🤗 BAAI/bge-m3-unsupervised (model) · 6.0k dl · ♡ 18
- 🤗 BAAI/bge-m3-retromae (model) · 1.6k dl · ♡ 18
- 🤗 Enno-Ai/bge-m3 (model) · 1 dl
- 🤗 Ruddy0201/YOUR_MODEL_NAME (model) · 4 dl
- 🤗 dabitbol/bge-m3-sparse-elastic (model) · 2 dl · ♡ 2
- 🤗 Bylaw/BAAI-bge-m3 (model) · 3 dl
- 🤗 comet24082002/bgeM3_MaxSq256_1024 (model) · 4 dl
- 🤗 abratnap/bge-m3 (model)
- 🤗 zkwang/bge-m3-h (model) · 1 dl
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Image Retrieval and Classification Techniques
