Enhancing Robust Fairness via Confusional Spectral Regularization

Gaojie Jin, Sihao Wu, Jiaxu Liu, Tianjin Huang, and Ronghui Mu

arXiv:2501.13273·cs.LG·January 24, 2025

TL;DR

This paper introduces a spectral regularization method, derived within the PAC-Bayesian framework, to improve worst-class robust accuracy and robust fairness in deep neural networks. It is motivated by the divergence in class-wise robust performance between the training set and the testing set.

Contribution

The paper derives a new robust generalization bound for the worst-class robust error and proposes a spectral norm regularization technique to improve robust fairness in DNNs.
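To make the notion of a spectral norm concrete, here is a minimal sketch that estimates the largest singular value of a matrix by power iteration. The matrix `toy_confusion` is hypothetical, and treating the regularized quantity as a class-by-class confusion matrix of robust predictions with a zeroed diagonal is an assumption suggested by the paper's title, not something this page confirms. The official repository may compute it differently.

```python
import math

def spectral_norm(matrix, iters=200):
    """Estimate the largest singular value of `matrix` by power iteration on M^T M.

    The uniform start vector suits matrices with nonnegative entries
    (such as confusion matrices); other matrices may need a random start.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # unit-length start vector
    for _ in range(iters):
        # u = M v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = M^T u
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0  # zero matrix, or start vector in the null space
        v = [x / norm_w for x in w]
    # For a unit vector v aligned with the top right-singular vector, ||M v|| = sigma_max.
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))

# Hypothetical 3-class matrix: entry [i][j] is the fraction of class-i
# examples that an attack pushes to class j (diagonal zeroed by assumption).
toy_confusion = [
    [0.00, 0.20, 0.10],
    [0.30, 0.00, 0.10],
    [0.05, 0.05, 0.00],
]
sigma = spectral_norm(toy_confusion)
```

A penalty of this form would grow when misclassifications concentrate in a few class pairs, which is one intuition for why a spectral norm could relate to worst-class error. That reading is an interpretation, not a statement from the paper.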

Findings

1. Regularization improves worst-class robust accuracy.
2. Method enhances robust fairness across datasets.
3. Experimental results confirm effectiveness of the approach.

Abstract

Recent research has highlighted a critical issue known as "robust fairness", where robust accuracy varies significantly across different classes, undermining the reliability of deep neural networks (DNNs). A common approach to address this has been to dynamically reweight classes during training, giving more weight to those with lower empirical robust performance. However, we find that class-wise robust performance diverges between the training set and the testing set, which limits the effectiveness of these explicit reweighting methods and indicates the need for a principled alternative. In this work, we derive a robust generalization bound for the worst-class robust error within the PAC-Bayesian framework, accounting for unknown data distributions. Our analysis shows that the worst-class robust error is influenced by two main factors: the spectral norm of the empirical robust…
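The reweighting baseline and the train/test divergence described in the abstract can be illustrated with a small sketch. Everything here is hypothetical: the toy labels, the `robust_ok` flags (1 if the model's prediction survives the attack), and the softmax-over-error weighting, which is one simple choice and not the scheme of any specific published method.

```python
import math

def per_class_robust_accuracy(labels, robust_ok, num_classes):
    """Fraction of examples in each class that stay correctly classified under attack."""
    totals = [0] * num_classes
    hits = [0] * num_classes
    for y, ok in zip(labels, robust_ok):
        totals[y] += 1
        hits[y] += int(ok)
    return [hits[c] / totals[c] if totals[c] else 0.0 for c in range(num_classes)]

def class_weights(robust_acc, temperature=1.0):
    """Softmax over per-class robust error, scaled so the mean weight is 1."""
    exps = [math.exp((1.0 - a) / temperature) for a in robust_acc]
    z = sum(exps)
    return [len(exps) * e / z for e in exps]

def worst_class(robust_acc):
    """Index of the class with the lowest robust accuracy (first one on ties)."""
    return min(range(len(robust_acc)), key=robust_acc.__getitem__)

# Hypothetical toy data: 3 classes, 4 examples each.
labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
train_ok = [1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0]
test_ok = [1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0]

train_acc = per_class_robust_accuracy(labels, train_ok, 3)  # [0.75, 0.25, 0.5]
test_acc = per_class_robust_accuracy(labels, test_ok, 3)    # [0.5, 0.5, 0.25]
weights = class_weights(train_acc)  # largest weight goes to class 1
```

In this toy case the weights emphasize class 1, the worst class on the training set, while the worst class on the test set is class 2. That mismatch is the kind of divergence the abstract says limits explicit reweighting.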

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01·Rating 8·Confidence 3

Strengths

- The paper is well-written and easy to follow. It gives adequate background on the topics of fairness, adversarial robustness, and PAC-bound, which can be very helpful for readers who are not familiar with the field to understand the gist of the paper.
- The contribution is quite significant because:
  1. this is the first PAC-Bayesian framework to characterize the worst-class robust error across different classes.
  2. the proposed regularization term is novel because it aims to improve the robust

Weaknesses

- The motivation from the intro feels disconnected from the method. In Figure 2, the authors argue that the class exhibiting the worst robust performance on the training set may not be the same as the one on the test set. However, how would the proposed method in Section 4 address this issue? It would be great if the authors could provide some discussion about the connection between this motivation and the proposed method.
- It would be great to also see some analysis for the hyper-parameters $

Reviewer 02·Rating 6·Confidence 4

Strengths

- The approximation in Section 4.1 makes it possible to implement an adversarial spectral-norm regularizer effectively.
- Based on limited empirical analysis, the method looks robust to the choice of hyperparameters.
- The method performs well against SOTA attack methods (AutoAttack), and not just the limited setting considered in the theoretical analysis.

Weaknesses

- Empirical analysis is somewhat limited both in domain and the datasets used. It’s hard to argue only based on the provided analysis that the results will extend to larger models, other vision datasets, or non-vision classification tasks. Adding experiments with much larger datasets (e.g. larger ImageNet) or non-vision tasks would greatly improve the empirical analysis in the paper.

Reviewer 03·Rating 6·Confidence 1

Strengths

1. This work represents the first endeavor to develop a PAC-Bayesian framework to characterize the worst-class robust error across different classes. This is an important problem.
2. This paper is theoretically solid.
3. The improvement brought by this method is obvious in the experiment.

Weaknesses

1. The fairness improvement method in this article seems to be designed only for adversarial training, so is there any improvement for the unfairness brought by other settings (such as long-tail training distribution)?
2. In the experiment, the author targeted AutoAttack. Does it have any effect on other attack methods?

Code & Models

Repositories

Alexkael/CONFUSIONAL-SPECTRAL-REGULARIZATION
PyTorch·Official

Taxonomy

Topics·Adversarial Robustness in Machine Learning

Methods·Sparse Evolutionary Training