Adaptive Depth Networks with Skippable Sub-Paths
Woochul Kang, Hyungseop Lee

TL;DR
This paper introduces a practical adaptive depth network approach with skippable sub-paths, enabling flexible accuracy-efficiency trade-offs without complex training, applicable to CNNs and transformers.
Contribution
It proposes a simple self-distillation training method for hierarchical residual stages, allowing dynamic sub-path skipping for resource-aware inference.
Findings
Effective in reducing inference latency
Maintains high accuracy with skipped sub-paths
Applicable to both CNNs and transformers
Abstract
Predictable adaptation of network depths can be an effective way to control inference latency and meet the resource conditions of various devices. However, previous adaptive depth networks provide neither general principles nor a formal explanation of why and which layers can be skipped; hence, their approaches are hard to generalize and require long and complex training steps. In this paper, we present a practical approach to adaptive depth networks that is applicable to various networks with minimal training effort. In our approach, every hierarchical residual stage is divided into two sub-paths, and they are trained to acquire different properties through a simple self-distillation strategy. While the first sub-path is essential for hierarchical feature learning, the second one is trained to refine the learned features and minimize performance degradation if it is skipped.…
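The stage structure described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the names `SkippableStage`, `run_network`, `scale`, and `distillation_gap` are hypothetical, the toy blocks operate on plain lists of floats instead of feature maps, and the mean-squared-error term is only a stand-in for the self-distillation objective, whose exact form is not given in this excerpt.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Block = Callable[[Vector], Vector]


def residual(block: Block, x: Vector) -> Vector:
    """Apply one residual block: x + F(x)."""
    return [xi + fi for xi, fi in zip(x, block(x))]


def scale(k: float) -> Block:
    """Toy residual function F(x) = k * x (hypothetical stand-in for a conv/attention block)."""
    return lambda x: [k * xi for xi in x]


class SkippableStage:
    """One residual stage split into a mandatory and a skippable sub-path."""

    def __init__(self, first: Sequence[Block], second: Sequence[Block]):
        self.first = list(first)    # always executed
        self.second = list(second)  # refinement blocks; may be skipped at inference

    def forward(self, x: Vector, skip_second: bool = False) -> Vector:
        for block in self.first:
            x = residual(block, x)
        if not skip_second:
            for block in self.second:
                x = residual(block, x)
        return x


def run_network(stages: Sequence[SkippableStage], x: Vector,
                skip_flags: Sequence[bool]) -> Vector:
    """Run all stages, skipping the second sub-path wherever the flag is True."""
    for stage, skip in zip(stages, skip_flags):
        x = stage.forward(x, skip_second=skip)
    return x


def distillation_gap(teacher_out: Vector, student_out: Vector) -> float:
    """Assumed stand-in loss: mean squared error between full and skipped outputs."""
    diffs = [(t - s) ** 2 for t, s in zip(teacher_out, student_out)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    stage = SkippableStage(first=[scale(1.0)], second=[scale(0.5)])
    full = stage.forward([1.0, 2.0])                       # both sub-paths
    shallow = stage.forward([1.0, 2.0], skip_second=True)  # first sub-path only
    print(full, shallow, distillation_gap(full, shallow))
```

In this sketch the full-depth pass plays the role of the teacher and the skipped pass the role of the student; training would push the gap toward zero so that skipping the second sub-path costs little accuracy. Per-stage skip flags are one plausible way to expose the depth choice at inference time.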
Taxonomy
Topics: Advanced Neural Network Applications · Image Enhancement Techniques · Advanced Vision and Imaging
