
TL;DR
This paper presents internal node bagging, a training method that explicitly groups nodes to learn features, improving small model performance while maintaining inference efficiency.
Contribution
It introduces a new training approach that explicitly groups nodes for feature learning, enhancing small model accuracy compared to dropout.
Findings
Outperforms dropout on benchmark datasets for small models.
Lets the model use many more parameters during training without increasing its size at inference.
Demonstrates significant accuracy improvements in experiments.
Abstract
We introduce a novel view of dropout as an implicit ensemble learning method that does not specify how many nodes, or which ones, learn a given feature. We propose a new training method, internal node bagging, which explicitly forces a group of nodes to learn a given feature during training and combines those nodes into a single node at inference. This means we can use many more parameters to improve the model's fitting ability during training while keeping the model small at inference. We test our method on several benchmark datasets and find that it performs significantly better than dropout on small models.
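The train-time grouping and inference-time merging described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. It assumes each group's nodes are sampled independently with a keep probability, that their pre-activations are summed before a ReLU, and that merging scales the summed weights by the keep probability. The function names (`train_forward`, `merge_groups`, `infer_forward`) and the `keep_prob` parameter are hypothetical.

```python
import numpy as np

def train_forward(x, W, b, keep_prob, rng):
    """Training-time pass (assumed formulation, not the paper's exact one).

    W has shape (n_groups, group_size, n_in); b has shape (n_groups, group_size).
    Each group of nodes is meant to learn one feature.
    """
    # Sample which nodes in each group are active for this example.
    mask = rng.random(b.shape) < keep_prob
    # Pre-activation of every node in every group: shape (n_groups, group_size).
    pre = np.einsum("gki,i->gk", W, x) + b
    # Sum the sampled nodes within each group, then apply one ReLU per group.
    z = (mask * pre).sum(axis=1)
    return np.maximum(z, 0.0)

def merge_groups(W, b, keep_prob):
    """Collapse each group into a single node for inference.

    Scaling by keep_prob matches the expected pre-activation seen in training.
    """
    W_merged = keep_prob * W.sum(axis=1)  # shape (n_groups, n_in)
    b_merged = keep_prob * b.sum(axis=1)  # shape (n_groups,)
    return W_merged, b_merged

def infer_forward(x, W_merged, b_merged):
    """Inference-time pass: one node per group, no sampling."""
    return np.maximum(W_merged @ x + b_merged, 0.0)

rng = np.random.default_rng(0)
n_groups, group_size, n_in = 3, 4, 5
W = rng.normal(size=(n_groups, group_size, n_in))
b = rng.normal(size=(n_groups, group_size))
x = rng.normal(size=n_in)

W_merged, b_merged = merge_groups(W, b, keep_prob=0.5)
h_train = train_forward(x, W, b, keep_prob=0.5, rng=rng)
h_infer = infer_forward(x, W_merged, b_merged)
```

Under these assumptions the training layer holds `n_groups * group_size` weight vectors while the merged layer holds only `n_groups`, which is the sense in which the inference model stays small.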
Taxonomy
Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
Methods: Dropout
