Unifying Synergies between Self-supervised Learning and Dynamic Computation

Tarun Krishna; Ayush K Rai; Alexandru Drimbarean; Eric Arazo; Paul Albert; Alan F Smeaton; Kevin McGuinness; Noel E O'Connor

arXiv:2301.09164 · cs.LG · September 12, 2023

TL;DR

This paper introduces a self-supervised learning (SSL) training method that learns a dense network and a gated sub-network simultaneously from scratch, reducing computational cost while maintaining performance across several image classification benchmarks.

Contribution

It presents an approach that co-evolves dense and gated networks during SSL pre-training, removing the separate fine-tuning or pruning stage usually needed to obtain a lightweight model.

Findings

01 Achieves comparable accuracy with reduced FLOPs.

02 Demonstrates effectiveness across CIFAR-10/100, STL-10, and ImageNet-100.

03 Provides a versatile architecture for resource-constrained industrial applications.

Abstract

Computationally expensive training strategies make self-supervised learning (SSL) impractical for resource-constrained industrial settings. Techniques like knowledge distillation (KD), dynamic computation (DC), and pruning are often used to obtain a lightweight model, which usually involves multiple epochs of fine-tuning (or distilling steps) of a large pre-trained model, adding further computational cost. In this work we present a novel perspective on the interplay between the SSL and DC paradigms. In particular, we show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting without any additional fine-tuning or pruning steps. The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off and therefore yields a generic and multi-purpose architecture for application-specific industrial…
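To make the dense versus gated distinction concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the linear layer, the sigmoid-threshold gate, and all names (`dense_forward`, `gate_decisions`, `gated_forward`, `multiply_adds`) are assumptions chosen for illustration. It only shows the general idea that a gated path can reuse the dense path's weights while skipping units that an input-dependent gate switches off, which lowers the multiply-add count.

```python
import math
import random


def dense_forward(weights, x):
    """Linear layer + ReLU where every output unit is computed."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def gate_decisions(gate_weights, x, threshold=0.5):
    """Hypothetical input-dependent gate: keep a unit if sigmoid(g . x) > threshold."""
    decisions = []
    for row in gate_weights:
        score = sum(g * xi for g, xi in zip(row, x))
        keep = 1.0 / (1.0 + math.exp(-score)) > threshold
        decisions.append(1 if keep else 0)
    return decisions


def gated_forward(weights, gates, x):
    """Same weights as the dense path; gated-off units are skipped and output 0."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x))) if keep else 0.0
        for row, keep in zip(weights, gates)
    ]


def multiply_adds(n_in, n_active_out):
    """Multiply-add count of the linear layer, counting only active outputs."""
    return n_in * n_active_out


# Toy setup with a fixed seed so the run is repeatable.
rng = random.Random(0)
n_in, n_out = 8, 16
weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
gate_weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
x = [rng.uniform(-1, 1) for _ in range(n_in)]

dense_out = dense_forward(weights, x)         # dense path: all units
gates = gate_decisions(gate_weights, x)       # which units stay on for this input
gated_out = gated_forward(weights, gates, x)  # gated path: shared weights, fewer units

dense_cost = multiply_adds(n_in, n_out)
gated_cost = multiply_adds(n_in, sum(gates))
print(f"active units: {sum(gates)}/{n_out}")
print(f"multiply-adds: dense={dense_cost}, gated={gated_cost}")
```

The sketch ignores the cost of computing the gates themselves, which in this toy linear layer is as large as the layer it gates. In a convolutional encoder a gate would typically act on a much smaller summary of the input, so its overhead would be small relative to the convolution it controls. How the paper's gates are parameterised and trained is not stated in the abstract and is not modelled here.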

Code & Models

Repositories

KrishnaTarun/Unification (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Image Processing Techniques and Applications

Methods: Pruning · Knowledge Distillation