Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT
Jon Saad-Falcon, Daniel Y. Fu, Simran Arora, Neel Guha, Christopher Ré

TL;DR
This paper introduces LoCoV1, a new benchmark for long-context retrieval, and presents M2-BERT, an efficient state-space encoder that significantly outperforms existing models on long document retrieval tasks.
Contribution
The paper develops a novel benchmark for evaluating long-context retrieval and proposes M2-BERT, a scalable, efficient encoder capable of handling documents up to 32K tokens.
Findings
M2-BERT outperforms Transformer-based models by at least 23.3 points on LoCoV1.
M2-BERT achieves this with up to 90x fewer parameters than the Transformer-based models it is compared against.
The proposed pretraining and finetuning strategies enable effective long-context retrieval.
Abstract
Retrieval pipelines, an integral component of many machine learning systems, perform poorly in domains where documents are long (e.g., 10K tokens or more) and where identifying the relevant document requires synthesizing information across the entire text. Developing long-context retrieval encoders suitable for these domains raises three challenges: (1) how to evaluate long-context retrieval performance, (2) how to pretrain a base language model to represent both short contexts (corresponding to queries) and long contexts (corresponding to documents), and (3) how to fine-tune this model for retrieval under the batch size limitations imposed by GPU memory constraints. To address these challenges, we first introduce LoCoV1, a novel 12-task benchmark constructed to measure long-context retrieval where chunking is not possible or not effective. We next present the M2-BERT retrieval encoder,…
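As a rough illustration of challenge (1), the sketch below shows one common way to score a retrieval encoder: rank documents by cosine similarity between query and document embeddings, then compute nDCG@10 against relevance labels. The embeddings, document ids, and helper names (`cosine`, `rank_documents`, `ndcg_at_k`) are hypothetical stand-ins, and the choice of nDCG@10 with binary relevance is an assumption made for the example, not a description of the LoCoV1 evaluation code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted by descending similarity to the query."""
    scores = {doc_id: cosine(query_vec, vec) for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """nDCG@k with binary relevance labels (assumed for this sketch)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0

# Toy vectors standing in for encoder outputs (hypothetical values).
doc_vecs = {
    "contract_a": [0.9, 0.1, 0.0],
    "contract_b": [0.1, 0.9, 0.1],
    "report_c": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

ranking = rank_documents(query_vec, doc_vecs)
score = ndcg_at_k(ranking, relevant_ids={"contract_a"}, k=10)
```

In a real evaluation each document vector would come from encoding the full document (or its chunks) with the model under test, and the score would be averaged over all queries in a task.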
Code & Models
- 🤗 Alibaba-NLP/gte-multilingual-base · model · 914k dl · ♡ 353
- 🤗 hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1 · model · 16 dl · ♡ 4
- 🤗 hazyresearch/M2-BERT-128-Retrieval-Encoder-V1 · model · 12 dl · ♡ 3
- 🤗 hazyresearch/M2-BERT-2k-Retrieval-Encoder-V1 · model · 5 dl · ♡ 2
- 🤗 hazyresearch/M2-BERT-8k-Retrieval-Encoder-V1 · model · 117 dl · ♡ 4
- 🤗 leeloolee/intention · model · 33 dl · ♡ 4
- 🤗 atomic-canyon/fermi-bert-512 · model · 14 dl
- 🤗 atomic-canyon/fermi-bert-1024 · model · 2 dl · ♡ 1
- 🤗 Maxthemacaque/onnx-gte-multilingual-base · model · ♡ 1
- 🤗 soprasteria/gte-multilingual-base · model
Taxonomy
Topics: Image Retrieval and Classification Techniques · Recommender Systems and Techniques · Semantic Web and Ontologies
Methods: Balanced Selection
