Adam: Dense Retrieval Distillation with Adaptive Dark Examples
Chongyang Tao, Chang Liu, Tao Shen, Can Xu, Xiubo Geng, Binxing Jiao, Daxin Jiang

TL;DR
This paper introduces ADAM, a novel knowledge distillation framework for dense retrieval that uses adaptive dark examples and self-paced learning to enhance the transfer of dark knowledge from teacher to student, improving retrieval performance.
Contribution
The paper proposes a new method for constructing dark examples and a self-paced distillation strategy to better transfer dark knowledge in dense retrieval models.
Findings
ADAM outperforms existing methods on benchmark datasets.
Dark examples with moderate relevance improve knowledge transfer.
Self-paced distillation enhances student model learning.
Abstract
To improve the performance of the dual-encoder retriever, one effective approach is knowledge distillation from the cross-encoder ranker. Existing works construct the candidate passages following the supervised learning setting where a query is paired with a positive passage and a batch of negatives. However, through empirical observation, we find that even the hard negatives from advanced methods are still too trivial for the teacher to distinguish, preventing the teacher from transferring abundant dark knowledge to the student through its soft label. To alleviate this issue, we propose ADAM, a knowledge distillation framework that can better transfer the dark knowledge held in the teacher with Adaptive Dark exAMples. Different from previous works that only rely on one positive and hard negatives as candidate passages, we create dark examples that all have moderate relevance to the…
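As a rough illustration of the setup the abstract describes, the sketch below distills a teacher's soft label over a query's candidate passages into dual-encoder scores, and shows why trivially easy negatives carry little dark knowledge: the teacher's distribution collapses toward one-hot. This is not the paper's implementation. The function names, the KL-divergence objective, the `temperature` parameter, and the toy scores are all assumptions chosen for illustration; the source does not state the exact loss.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the candidate axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dual_encoder_scores(query_vecs, passage_vecs):
    # Student relevance: dot product between each query vector (B, d)
    # and each of its n candidate passage vectors (B, n, d) -> (B, n).
    return np.einsum("bd,bnd->bn", query_vecs, passage_vecs)

def distillation_loss(teacher_scores, student_scores, temperature=1.0):
    # Assumed objective: KL(teacher || student) over each query's
    # candidate list, averaged over queries.
    eps = 1e-12
    p = softmax(teacher_scores / temperature)
    q = softmax(student_scores / temperature)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(kl.mean())

def soft_label_entropy(teacher_scores):
    # Entropy of the teacher's soft label; near zero means near one-hot,
    # i.e. little information beyond the hard label.
    p = softmax(teacher_scores)
    return float(-np.sum(p * np.log(p + 1e-12), axis=-1).mean())

# Toy teacher scores (hypothetical): one positive followed by three candidates.
easy_negatives = np.array([[10.0, -10.0, -10.0, -10.0]])  # trivial to separate
dark_examples = np.array([[2.0, 1.5, 1.0, 0.5]])          # moderately relevant

entropy_easy = soft_label_entropy(easy_negatives)
entropy_dark = soft_label_entropy(dark_examples)

# Random vectors stand in for encoder outputs; a real teacher would be a
# cross-encoder scoring each (query, passage) pair jointly.
rng = np.random.default_rng(0)
queries = rng.normal(size=(2, 8))
passages = rng.normal(size=(2, 4, 8))
student = dual_encoder_scores(queries, passages)
teacher = rng.normal(size=(2, 4))
loss = distillation_loss(teacher, student)
```

In this toy setting `entropy_easy` is far smaller than `entropy_dark`, which mirrors the abstract's observation that overly easy negatives leave the soft label with little to transfer.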
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
Methods: Knowledge Distillation
