PerNodeDrop: A Method Balancing Specialized Subnets and Regularization in Deep Neural Networks
Gelesh G Omathil, Sreeja CS

TL;DR
PerNodeDrop introduces per-sample, per-node stochastic perturbations in neural networks to enhance regularization, balancing the preservation of useful co-adaptation with the reduction of overfitting, thereby improving generalization across multiple domains.
Contribution
The paper proposes a novel regularization method that applies input-specific noise at the node level, in contrast to existing approaches that apply noise uniformly across a layer, to better control co-adaptation and improve model robustness.
Findings
Improves generalization on vision, text, and audio benchmarks.
Reduces overfitting by attenuating excessive co-adaptation.
Narrows the gap between training and validation performance.
Abstract
Deep neural networks possess strong representational capacity yet remain vulnerable to overfitting, primarily because neurons tend to co-adapt in ways that, while capturing complex and fine-grained feature interactions, also reinforce spurious and non-generalizable patterns that inflate training performance but reduce reliability on unseen data. Noise-based regularizers such as Dropout and DropConnect address this issue by injecting stochastic perturbations during training, but the noise they apply is typically uniform across a layer or across a batch of samples, which can suppress both harmful and beneficial co-adaptation. This work introduces PerNodeDrop, a lightweight stochastic regularization method. It applies per-sample, per-node perturbations to break the uniformity of the noise injected by existing techniques, thereby allowing each node to experience input-specific…
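The contrast the abstract draws, between noise shared across a batch and noise drawn per sample and per node, can be illustrated with a minimal sketch. This is not the authors' implementation: the Bernoulli mask, the inverted 1/keep rescaling, and the names `shared_mask_dropout` and `per_node_drop` are assumptions chosen for illustration, and the paper's actual perturbation may differ.

```python
import random

def shared_mask_dropout(batch, rate, rng):
    """Contrast case (assumed): one Bernoulli mask per node, reused for every sample."""
    keep = 1.0 - rate
    mask = [rng.random() < keep for _ in batch[0]]
    # Kept activations are rescaled by 1/keep so the expected value is unchanged.
    return [[v / keep if m else 0.0 for v, m in zip(row, mask)] for row in batch]

def per_node_drop(batch, rate, rng):
    """Hypothetical per-sample, per-node variant: an independent draw for each entry."""
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row] for row in batch]

# Toy activations: 8 samples, 16 nodes, all ones so the masks are easy to read off.
rng = random.Random(0)
x = [[1.0] * 16 for _ in range(8)]
shared = shared_mask_dropout(x, rate=0.5, rng=rng)
per = per_node_drop(x, rate=0.5, rng=rng)
```

In `shared`, every sample loses the same nodes, while in `per` each sample sees its own pattern of dropped nodes, which is the input-specific behaviour the abstract describes.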
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
