KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder
Maheswar Bora, Saurabh Atreya, Aritra Mukherjee, Abhijit Das

TL;DR
KDC-MAE introduces a novel self-supervised learning architecture that combines contrastive learning, self-distillation, and masked data modeling to enhance multi-modal and multi-task representation learning.
Contribution
The paper proposes KDC-MAE, a new SSL framework that integrates multiple objectives with a novel masking strategy and weighted combination for improved joint learning.
Findings
Enhanced performance on multiple modalities and tasks.
Effective combination of contrastive, distillation, and masking objectives.
Demonstrated superiority over existing SSL methods.
Abstract
In this work, we extend the thought and show a way forward for the self-supervised learning (SSL) paradigm by combining contrastive learning, self-distillation (knowledge distillation), and masked data modelling, the three major SSL frameworks, to learn a joint and coordinated representation. The proposed technique draws on the collaborative power of these different SSL learning objectives. To learn the objectives jointly, we propose a new SSL architecture, KDC-MAE; a complementary masking strategy to learn the modular correspondence; and a weighted scheme to combine the objectives in a coordinated way. Experimental results indicate that the contrastive masking correspondence, together with the KD learning objective, improves learning for multiple modalities over multiple tasks.
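The abstract names two mechanisms without giving details: a complementary masking strategy and a weighted combination of the three objectives. The following is a minimal sketch of one way these could look, not the authors' implementation. It assumes that "complementary" means the second mask hides exactly the patches the first mask leaves visible, and that the total loss is a plain weighted sum. The function names, the mask ratio, and the default weights are all hypothetical.

```python
import random


def complementary_masks(num_patches, mask_ratio=0.5, seed=None):
    """Return two boolean masks (True = patch is masked) that are exact complements.

    Assumption: the second view masks precisely the patches the first view keeps.
    """
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    masked_idx = set(rng.sample(range(num_patches), num_masked))
    mask_a = [i in masked_idx for i in range(num_patches)]
    mask_b = [not m for m in mask_a]  # complement of mask_a
    return mask_a, mask_b


def combined_loss(l_rec, l_con, l_kd, w_rec=1.0, w_con=0.01, w_kd=0.1):
    """Weighted sum of reconstruction, contrastive, and distillation losses.

    The weights are illustrative placeholders, not values from the paper.
    """
    return w_rec * l_rec + w_con * l_con + w_kd * l_kd


if __name__ == "__main__":
    mask_a, mask_b = complementary_masks(num_patches=16, mask_ratio=0.5, seed=0)
    print("patches masked in view A:", sum(mask_a))
    print("patches masked in view B:", sum(mask_b))
    print("total loss:", combined_loss(1.0, 2.0, 3.0, w_rec=1.0, w_con=0.5, w_kd=0.25))
```

With a mask ratio of 0.5 the two views see equally sized, disjoint sets of patches, so every patch is visible to exactly one view. Other ratios give views of unequal size under this assumption.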
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Numerical Analysis Techniques
