Boosting Residual Networks with Group Knowledge
Shengji Tang, Peng Ye, Baopu Li, Weihao Lin, Tao Chen, Tong He, Chong Yu, Wanli Ouyang

TL;DR
This paper introduces a group knowledge training framework that enhances residual networks by leveraging diverse subnet knowledge and hierarchical grouping, leading to improved performance and efficiency.
Contribution
It proposes a novel group knowledge-based training method that utilizes hierarchical subnet grouping and knowledge aggregation to boost residual network performance.
Findings
Achieves superior performance on multiple datasets.
Provides better efficiency compared to existing methods.
Effectively leverages subnet diversity for training improvement.
Abstract
Recent research understands the residual networks from a new perspective of the implicit ensemble model. From this view, previous methods such as stochastic depth and stimulative training have further improved the performance of the residual network by sampling and training of its subnets. However, they both use the same supervision for all subnets of different capacities and neglect the valuable knowledge generated by subnets during training. In this manuscript, we mitigate the significant knowledge distillation gap caused by using the same kind of supervision and advocate leveraging the subnets to provide diverse knowledge. Based on this motivation, we propose a group knowledge based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of…
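The subnet-in-subnet sampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the independent per-block keep probability, and the number of levels are all assumptions chosen for illustration, and the knowledge aggregation step is left out because the abstract is truncated before describing it. The sketch only shows how repeatedly sampling a subnet from the previous subnet yields a nested hierarchy of residual-block subsets.

```python
import random


def sample_subnet(block_ids, keep_prob, rng):
    """Keep each residual block independently with probability keep_prob.

    Hypothetical helper: this mirrors stochastic-depth-style block dropping
    and is an assumption, not the sampling rule used in the paper.
    """
    return [b for b in block_ids if rng.random() < keep_prob]


def subnet_in_subnet(num_blocks, levels, keep_prob, seed=0):
    """Return a chain of nested subnets, largest first.

    chain[0] is the full network (all block indices); each later entry is
    sampled from the entry before it, so it is always a subset of it.
    """
    rng = random.Random(seed)
    current = list(range(num_blocks))
    chain = [current]
    for _ in range(levels):
        current = sample_subnet(current, keep_prob, rng)
        chain.append(current)
    return chain


if __name__ == "__main__":
    # Illustrative values only: 16 residual blocks, 3 nested levels.
    for depth, blocks in enumerate(subnet_in_subnet(16, 3, 0.7, seed=1)):
        print(f"level {depth}: {len(blocks)} blocks -> {blocks}")
```

Because every level is drawn from the one before it, subnets at the same level have comparable capacity, which is one plausible reading of the "hierarchical groups" in the abstract.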
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
Methods: Stochastic Depth · Knowledge Distillation
