ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models
Antoine Chaffin, Luca Arnaboldi, Amélie Chatelain, Florent Krzakala

TL;DR
This paper demonstrates that large-scale pre-training of multi-vector models like ColBERT significantly improves their performance: a fully pre-trained model, trained only on public data, outperforms models obtained by a small knowledge distillation step on top of base models trained on stronger, closed data.
Contribution
It shows the effectiveness of large-scale multi-vector pre-training for ColBERT models and explores training strategies to optimize performance without extensive unsupervised phases.
Findings
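Fully pre-trained ColBERT-Zero outperforms GTE-ModernColBERT and its base model, GTE-ModernBERT, setting a new state of the art for models of this size.
Adding a supervised step before knowledge distillation brings performance much closer to full pre-training while skipping the costly unsupervised phase.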
Aligning fine-tuning and pre-training setups is crucial for optimal results.
Abstract
Current state-of-the-art multi-vector models are obtained through a small Knowledge Distillation (KD) training step on top of strong single-vector models, leveraging the large-scale pre-training of these models. In this paper, we study the pre-training of multi-vector models and show that large-scale multi-vector pre-training yields much stronger multi-vector models. Notably, a fully ColBERT-pre-trained model, ColBERT-Zero, trained only on public data, outperforms GTE-ModernColBERT as well as its base model, GTE-ModernBERT, which leverages closed and much stronger data, setting a new state of the art for models of this size. We also find that, although performing only a small KD step is not enough to achieve results close to full pre-training, adding a supervised step beforehand makes it possible to achieve much closer performance while skipping the most costly unsupervised phase. Finally, we find that…
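For readers unfamiliar with the distinction the abstract draws between single-vector and multi-vector models, the following is a minimal sketch of the two scoring schemes. It is illustrative only and is not the paper's training or inference code: the function names and the toy two-dimensional token embeddings are hypothetical, mean pooling is an assumption chosen for simplicity, and all vectors are assumed to be non-zero. The multi-vector score uses the standard ColBERT late-interaction (MaxSim) rule: each query token is matched to its most similar document token, and the per-token maxima are summed.

```python
import math


def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, doc_vecs):
    """Multi-vector (late-interaction) score: for every query token,
    take the best cosine similarity over all document tokens, then sum."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


def single_vector_score(query_vecs, doc_vecs):
    """Single-vector score: mean-pool the token vectors on each side
    (an assumption for this sketch), then take one cosine similarity."""
    def mean_pool(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    return dot(normalize(mean_pool(query_vecs)), normalize(mean_pool(doc_vecs)))


# Hypothetical toy token embeddings (one inner list per token).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # covers both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.1]]              # covers only the first one

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
pooled_a = single_vector_score(query, doc_a)
```

In this toy setup, `doc_a` scores higher than `doc_b` under MaxSim because each query token finds an exact match among its tokens, whereas the single-vector variant compresses each side into one pooled vector before comparing.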
Code & Models
- 🤗 lightonai/ColBERT-Zero · model · 2.3k dl · ♡ 34
- 🤗 lightonai/ColBERT-Zero-noprompts · model · 20 dl · ♡ 2
- 🤗 lightonai/ColBERT-Zero-supervised · model · 53 dl · ♡ 3
- 🤗 lightonai/ColBERT-Zero-supervised-noprompts · model · 7 dl
- 🤗 lightonai/ColBERT-Zero-unsupervised · model · 141 dl · ♡ 2
- 🤗 lightonai/ColBERT-Zero-unsupervised-noprompts · model · 7 dl
- 🤗 lightonai/ModernColBERT-embed-base · model · 5 dl
- 🤗 lightonai/ModernColBERT-embed-base-kd-only · model · 19 dl · ♡ 1
- 🤗 lightonai/ModernColBERT-embed-base-supervised · model · 3 dl
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Explainable Artificial Intelligence (XAI)
