iMacHSR: Intermediate Multi-Access Heterogeneous Supervision and Regularization Scheme Toward Architecture-Agnostic Training
Wei-Bin Kou, Guangxu Zhu, Yichen Jin, Bingyang Cheng, Shuai Wang, Ming Tang, Yik-Chung Wu

TL;DR
The paper introduces iMacHSR, a novel architecture-agnostic training scheme that employs intermediate supervision and regularization to improve deep learning model generalization and performance across various architectures.
Contribution
It proposes a new intermediate supervision and regularization scheme that is architecture-agnostic, guiding intermediate layers with diverse losses and regularization to enhance model generalization.
Findings
Outperforms traditional supervision methods by up to 9.19% in mIoU.
Effective across multiple model architectures.
Improves generalization and reduces overfitting.
Abstract
While deep supervision is a powerful training strategy that supervises intermediate layers with auxiliary losses, it faces three underexplored problems: (I) Existing deep supervision techniques are generally tightly bound to specific model architectures, lacking generality. (II) Using the identical loss function for intermediate and output layers causes intermediate layers to prioritize output-specific features prematurely, limiting generalizable representations. (III) The lack of regularization on hidden activations risks overconfident predictions, reducing generalization to unseen scenarios. To tackle these challenges, we propose an architecture-agnostic, intermediate Multi-access Heterogeneous Supervision and Regularization (iMacHSR) scheme. Specifically, the proposed iMacHSR introduces the following integral strategies: (I) we select multiple intermediate layers based on predefined…
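The general idea in the abstract (a different loss on intermediate layers than on the output, plus a penalty on hidden activations) can be illustrated with a minimal sketch. The abstract as excerpted here does not say which losses, weights, or regularizer iMacHSR uses, so everything concrete below is an assumption chosen for illustration: the Brier-style auxiliary loss, the mean-squared activation penalty, the function names, and the weights `aux_weight` and `reg_weight` are all hypothetical and should not be read as the authors' method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Output-layer loss: negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def brier_loss(logits, target):
    """Hypothetical auxiliary loss (an assumption, not from the paper):
    squared error between predicted probabilities and the one-hot target."""
    probs = softmax(logits)
    return sum(
        (p - (1.0 if i == target else 0.0)) ** 2 for i, p in enumerate(probs)
    )


def activation_penalty(activations):
    """Hypothetical regularizer (an assumption): mean squared activation."""
    return sum(a * a for a in activations) / len(activations)


def total_loss(output_logits, aux_logits_list, hidden_activations, target,
               aux_weight=0.4, reg_weight=0.01):
    """Combine the output loss, heterogeneous auxiliary losses on the
    selected intermediate layers, and an activation penalty.
    The weights are illustrative placeholders."""
    loss = cross_entropy(output_logits, target)
    for aux_logits in aux_logits_list:
        loss += aux_weight * brier_loss(aux_logits, target)
    for acts in hidden_activations:
        loss += reg_weight * activation_penalty(acts)
    return loss


# Usage: one output head, one auxiliary head, one hidden activation vector.
example = total_loss([0.0, 0.0], [[0.0, 0.0]], [[1.0, 1.0]], target=0)
```

Because the auxiliary heads only consume intermediate features and produce logits, a combination like this does not depend on the backbone's structure, which is the sense in which such a scheme can be architecture-agnostic.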
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks
