OUIDecay: Adaptive Layer-wise Weight Decay for CNNs Using Online Activation Patterns

Alberto Fernández-Hernández, Jose I. Mestre, Cristian Pérez-Corral, Manuel F. Dolz, Jose Duato, and Enrique S. Quintana-Ortí

arXiv:2605.10161 · cs.LG · May 12, 2026

TL;DR

OUIDecay is an adaptive, layer-wise weight decay method for CNNs that uses online activation patterns to adjust regularization without validation data, achieving the best mean validation loss in 7 of 8 settings.

Contribution

The paper presents OUIDecay, an activation-based, online, layer-wise weight decay scheduler for CNNs that adapts each layer's decay during training without requiring validation data.

Findings

01

OUIDecay achieves the best mean validation loss in 7 out of 8 settings.

02

Activation-driven decay outperforms fixed and gradient-based methods.

03

The approach is lightweight and suitable for online training.

Abstract

Weight decay remains one of the most widely used regularization mechanisms for training convolutional neural networks, yet it is still commonly applied as a fixed coefficient shared by all layers throughout training. This uniform treatment ignores that different layers may follow different structural dynamics and therefore may require different regularization strengths. In this work, we propose OUIDecay, an adaptive layer-wise and time-dependent weight decay scheduler for CNNs driven by the Overfitting-Underfitting Indicator (OUI), an activation-based metric previously shown to provide early information about regularization quality. OUIDecay uses a lightweight batch-based formulation of OUI to monitor the structural behavior of each layer online and periodically rescales its weight decay relative to the other layers in the network. Unlike gradient-based adaptive decay methods, our…
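
The relative, layer-wise rescaling described above can be illustrated with a small sketch. This is not the paper's formulation: the abstract gives neither the OUI formula nor the update rule, so `pattern_diversity` (a mean pairwise Hamming distance over binary activation patterns) is only a hypothetical stand-in for a batch-based activation score, and `rescale_weight_decay`, its `strength` parameter, its clamping bounds, and the direction of the adjustment are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def pattern_diversity(patterns):
    """Hypothetical stand-in for a batch-based activation score.

    `patterns` holds one binary on/off activation pattern per sample for a
    single layer. Returns the mean normalized Hamming distance over all
    sample pairs, a value in [0, 1]. This is NOT the paper's OUI definition.
    """
    pairs = list(combinations(patterns, 2))
    if not pairs:
        return 0.0
    width = len(patterns[0])
    return mean(sum(a != b for a, b in zip(p, q)) / width for p, q in pairs)


def rescale_weight_decay(base_wd, scores, strength=1.0, lo=0.1, hi=10.0):
    """Rescale each layer's weight decay relative to the other layers.

    Assumption: layers scoring above the network-wide mean receive more
    decay and layers below it receive less. The source does not state the
    direction; flip the sign of `strength` to invert it.
    """
    avg = mean(scores.values())
    new_wd = {}
    for layer, score in scores.items():
        factor = 1.0 + strength * (score - avg)
        factor = min(max(factor, lo), hi)  # keep the multiplier bounded
        new_wd[layer] = base_wd * factor
    return new_wd


# Example: score two hypothetical layers from one batch, then rescale.
batch_patterns = {
    "conv1": [[1, 0, 1, 0], [1, 0, 1, 0], [1, 1, 1, 0]],
    "conv2": [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]],
}
scores = {name: pattern_diversity(p) for name, p in batch_patterns.items()}
layer_wd = rescale_weight_decay(5e-4, scores)
```

In a training loop, a dictionary like `layer_wd` would be recomputed every few hundred steps and written into each layer's optimizer settings; the update interval is likewise an assumption, since the abstract only says the rescaling is periodic.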
