Improving Robustness with Adaptive Weight Decay

Amin Ghiasi; Ali Shafahi; Reza Ardekani

arXiv:2210.00094 · cs.LG · December 5, 2023

TL;DR

This paper introduces adaptive weight decay, which dynamically adjusts the regularization hyper-parameter during training, leading to significant improvements in adversarial robustness and reduced overfitting across various datasets and models.

Contribution

The authors propose an adaptive weight decay method that tunes the weight decay hyper-parameter on the fly at each training iteration, based on the gradient of the classification loss and the $\ell_2$-norm of the weights, to improve robustness and reduce overfitting.

Findings

1. 20% relative robustness improvement on CIFAR-100 over the best-tuned traditional weight decay
2. 10% relative robustness improvement on CIFAR-10 over the best-tuned traditional weight decay
3. Lower sensitivity to the learning rate, and smaller weight norms
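
A note on reading the percentages above: a relative improvement is presumably measured against the baseline's own value rather than as a difference in percentage points. As a sketch, with $a_{\text{wd}}$ and $a_{\text{awd}}$ as assumed symbols (they do not appear in the source) for robust accuracy under tuned traditional weight decay and under adaptive weight decay:

$$
\text{relative improvement} = \frac{a_{\text{awd}} - a_{\text{wd}}}{a_{\text{wd}}}
$$

Under this reading, the absolute gain in percentage points is smaller than the quoted relative figure whenever the baseline robust accuracy is below 100%.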

Abstract

We propose adaptive weight decay, which automatically tunes the hyper-parameter for weight decay during each training iteration. For classification problems, we propose changing the value of the weight decay hyper-parameter on the fly based on the strength of updates from the classification loss (i.e., the gradient of cross-entropy) and the regularization loss (i.e., the $\ell_2$-norm of the weights). We show that this simple modification can result in large improvements in adversarial robustness, an area which suffers from robust overfitting, without requiring extra data, across various datasets and architecture choices. For example, our reformulation results in a 20% relative robustness improvement on CIFAR-100 and a 10% relative robustness improvement on CIFAR-10 compared to the best-tuned hyper-parameters of traditional weight decay, resulting in models that have comparable…
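
The idea in the abstract can be illustrated with a minimal sketch. The abstract only says that the coefficient depends on the strength of the cross-entropy gradient and on the $\ell_2$-norm of the weights; the specific ratio used below (gradient norm divided by weight norm, scaled by a constant) is an assumption made for illustration, as are the names `adaptive_weight_decay_coeff`, `sgd_step_with_awd`, and `awd_strength`. It is not taken from the authors' code.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def adaptive_weight_decay_coeff(weights, ce_grads, awd_strength, eps=1e-12):
    """Hypothetical helper: per-iteration weight decay coefficient.

    ASSUMPTION: the coefficient is awd_strength * ||grad|| / ||w||, so the
    decay term w * coeff has a norm proportional to the gradient norm.
    """
    return awd_strength * l2_norm(ce_grads) / (l2_norm(weights) + eps)


def sgd_step_with_awd(weights, ce_grads, lr, awd_strength):
    """One plain SGD step with the adaptive coefficient recomputed on the fly.

    The coefficient is treated as a constant within the step (no gradient
    is taken through it).
    """
    coeff = adaptive_weight_decay_coeff(weights, ce_grads, awd_strength)
    new_weights = [w - lr * (g + coeff * w) for w, g in zip(weights, ce_grads)]
    return new_weights, coeff


# Toy usage with made-up numbers: ||w|| = 5 and ||grad|| = 1.
weights = [3.0, 4.0]
ce_grads = [0.6, 0.8]
new_weights, coeff = sgd_step_with_awd(weights, ce_grads, lr=0.1, awd_strength=0.5)
```

In this sketch the decay term grows when the classification gradient is large relative to the weights and shrinks when it is small, which is one way to keep the two update strengths in a fixed proportion. A fixed weight decay coefficient does not have this property.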

Code & Models

Repositories

hed-ucas/alphadecay (PyTorch)

Videos

Improving Robustness with Adaptive Weight Decay · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications

Methods: Weight Decay