Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives
Si Sun, Chenyan Xiong, Yue Yu, Arnold Overwijk, Zhiyuan Liu, Jie Bao

TL;DR
This paper introduces ANCE-Tele, a method that reduces catastrophic forgetting in dense retrieval training by using momentum and lookahead negatives, leading to more stable training and better performance.
Contribution
The paper proposes ANCE-Tele, a novel negative sampling technique that mitigates catastrophic forgetting and improves dense retrieval training stability and effectiveness.
Findings
On web search and OpenQA, ANCE-Tele outperforms previous state-of-the-art systems of similar size.
It eliminates the dependency on sparse retrieval negatives.
It is competitive with systems that use significantly more (50x) parameters.
Abstract
In this paper, we investigate the instability in the standard dense retrieval training, which iterates between model training and hard negative selection using the being-trained model. We show the catastrophic forgetting phenomena behind the training instability, where models learn and forget different negative groups during training iterations. We then propose ANCE-Tele, which accumulates momentum negatives from past iterations and approximates future iterations using lookahead negatives, as "teleportations" along the time axis to smooth the learning process. On web search and OpenQA, ANCE-Tele outperforms previous state-of-the-art systems of similar size, eliminates the dependency on sparse retrieval negatives, and is competitive among systems using significantly more (50x) parameters. Our analysis demonstrates that teleportation negatives reduce catastrophic forgetting and improve…
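The negative-selection idea in the abstract can be illustrated with a minimal sketch. It assumes that the negatives for the next training iteration are simply the union of three pools: hard negatives retrieved by the current model, momentum negatives carried over from the previous iteration, and lookahead negatives standing in for a future iteration. The function names, the plain-union merge, and the choice to pass lookahead candidates in as a ready-made list are all assumptions made for illustration; the abstract does not say how the lookahead pool is computed or how the pools are weighted, so this is not the authors' implementation.

```python
from typing import Dict, Iterable, List, Set


def hard_negatives(scores: Dict[str, float], positives: Set[str], k: int) -> List[str]:
    """Top-k highest-scoring documents that are not labeled positive."""
    # Sort by descending score; ties are broken by document id for determinism.
    ranked = sorted(scores, key=lambda doc: (-scores[doc], doc))
    return [doc for doc in ranked if doc not in positives][:k]


def merge_negatives(pools: Iterable[List[str]], positives: Set[str]) -> List[str]:
    """Order-preserving union of several negative pools, with positives filtered out."""
    seen: Set[str] = set()
    merged: List[str] = []
    for pool in pools:
        for doc in pool:
            if doc not in seen and doc not in positives:
                seen.add(doc)
                merged.append(doc)
    return merged


def teleportation_negatives(
    current_scores: Dict[str, float],
    previous_negatives: List[str],
    lookahead_candidates: List[str],
    positives: Set[str],
    k: int,
) -> List[str]:
    """Hypothetical helper: combine current, momentum, and lookahead negatives.

    current_scores       -- query-document scores from the model being trained
    previous_negatives   -- negatives used in the previous iteration (momentum)
    lookahead_candidates -- assumed proxy for negatives of a future iteration
    """
    current = hard_negatives(current_scores, positives, k)
    return merge_negatives([current, previous_negatives, lookahead_candidates], positives)


# Toy example for a single query with one labeled positive document.
positives = {"d1"}
scores = {"d1": 0.9, "d2": 0.8, "d3": 0.7, "d4": 0.1}
previous = ["d3", "d5"]
lookahead = ["d6", "d2"]
result = teleportation_negatives(scores, previous, lookahead, positives, k=2)
print(result)  # ['d2', 'd3', 'd5', 'd6']
```

Keeping negatives from the previous iteration in the pool means the model continues to see groups it has already learned to separate, which is the intuition the abstract gives for smoothing the learning process.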
Code & Models
- 🤗 OpenMatch/ance-tele_msmarco_qry-psg-encoder (model, 1 dl)
- 🤗 OpenMatch/ance-tele_nq_qry-encoder (model, 1 dl)
- 🤗 OpenMatch/ance-tele_nq_psg-encoder (model, 1 dl)
- 🤗 OpenMatch/ance-tele_triviaqa_qry-encoder (model, 2 dl)
- 🤗 OpenMatch/ance-tele_triviaqa_psg-encoder (model, 1 dl)
- 🤗 OpenMatch/dpr_bert-base_msmarco_qry-psg-encoder (model, 3 dl)
- 🤗 OpenMatch/ance-tele_coco-base_msmarco_qry-psg-encoder (model, 3 dl)
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
