Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning
Chris Lengerich, Gabriel Synnaeve, Amy Zhang, Hugh Leather, Kurt Shuster, François Charton, Charysse Redwood

TL;DR
This paper introduces contrastive distillation, a self-supervised loss policy that makes transfer learning more sample-efficient by exploiting high mutual information between source and target tasks, and that outperforms common transfer learning methods.
Contribution
The paper proposes a novel contrastive distillation method that improves transfer learning by efficiently sampling negative examples and capturing high mutual information.
Findings
Outperforms common transfer learning methods
Enables more efficient sampling of negative examples
Facilitates rapid adaptation in diverse subspaces
Abstract
Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization. While some approaches have been adopted to refine representations via auxiliary self-supervised losses while simultaneously learning decision policies, learning compositional representations from hand-designed and context-independent self-supervised losses (multi-view) still adapts relatively slowly to the real world, which contains many non-IID subspaces requiring rapid distribution shift in both time and spatial attention patterns at varying levels of abstraction. In contrast, supervised language model cascades have shown the flexibility to adapt to many diverse manifolds, and hints of self-learning needed for autonomous task transfer. However, to date, transfer methods…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Speech and Dialogue Systems
Methods: Self-Learning
