Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer
Mahdi Ghorbani, Fahimeh Fooladgar, Shohreh Kasaei

TL;DR
This paper introduces a multi-branched adversarial knowledge transfer method that enhances lightweight neural networks' performance through self-distillation and adversarial training, improving accuracy without extra inference costs.
Contribution
It proposes a novel ensemble self-distillation approach with adversarial learning for compact models, outperforming existing methods in accuracy and efficiency.
Findings
Outperforms primary models in accuracy at the same parameter count
Effective in both image classification and encoder-decoder architectures
Achieves significant improvements over previous self-distillation techniques
Abstract
Deep neural network architectures have attained remarkable improvements in scene understanding tasks. Utilizing an efficient model is one of the most important constraints for limited-resource devices. Recently, several compression methods have been proposed to diminish the heavy computational burden and memory consumption. Among them, pruning and quantization methods exhibit a critical drop in performance because they compress the model parameters, while knowledge distillation methods improve the performance of compact models by training lightweight networks under the supervision of cumbersome networks. In the proposed method, knowledge distillation is performed within the network by constructing multiple branches over the primary stream of the model, an approach known as self-distillation. Therefore, the ensemble of sub-neural network models has been proposed to…
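The abstract describes distillation performed inside a single network through multiple branches, but this summary gives no loss function. The sketch below illustrates one common form of multi-branch self-distillation, not the authors' method. Its assumptions: the averaged branch logits act as the teacher, and `temperature` and `alpha` are hypothetical hyperparameters. The adversarial component mentioned in the TL;DR is deliberately left out.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy of one sample against an integer class label."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def self_distillation_loss(branch_logits, label, temperature=3.0, alpha=0.5):
    """Assumed loss (not taken from the paper): each branch is trained on
    the true label and on the softened average of all branch logits."""
    n_branches = len(branch_logits)
    n_classes = len(branch_logits[0])
    # Ensemble teacher: element-wise mean of the branch logits.
    ensemble = [sum(b[c] for b in branch_logits) / n_branches
                for c in range(n_classes)]
    teacher = softmax(ensemble, temperature)
    total = 0.0
    for logits in branch_logits:
        hard = cross_entropy(logits, label)
        soft = kl_divergence(teacher, softmax(logits, temperature))
        # temperature**2 keeps the soft-target gradient scale comparable.
        total += (1 - alpha) * hard + alpha * temperature ** 2 * soft
    return total / n_branches
```

When every branch produces the same logits, the distillation term is zero and only the weighted cross-entropy remains. Branches that disagree with the ensemble are penalized by the KL term. Because the branches are used only during training, a setup like this is consistent with the claim of no extra inference cost.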
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Pruning · Knowledge Distillation
