Enhancing Robust Fairness via Confusional Spectral Regularization
Gaojie Jin, Sihao Wu, Jiaxu Liu, Tianjin Huang, and Ronghui Mu

TL;DR
This paper introduces a spectral regularization method within the PAC-Bayesian framework to improve worst-class robust accuracy and fairness in deep neural networks, addressing divergence issues between training and testing robustness.
Contribution
It derives a new robust generalization bound and proposes a novel spectral norm regularization technique for robust fairness in DNNs.
Findings
Regularization improves worst-class robust accuracy.
Method enhances robust fairness across datasets.
Experimental results confirm effectiveness of the approach.
Abstract
Recent research has highlighted a critical issue known as "robust fairness", where robust accuracy varies significantly across different classes, undermining the reliability of deep neural networks (DNNs). A common approach to address this has been to dynamically reweight classes during training, giving more weight to those with lower empirical robust performance. However, we find that class-wise robust performance diverges between the training set and the test set, which limits the effectiveness of these explicit reweighting methods and indicates the need for a principled alternative. In this work, we derive a robust generalization bound for the worst-class robust error within the PAC-Bayesian framework, accounting for unknown data distributions. Our analysis shows that the worst-class robust error is influenced by two main factors: the spectral norm of the empirical robust…
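The abstract is cut off before it names the matrix whose spectral norm appears in the bound. The sketch below assumes, based only on the paper's title, that a class-wise robust confusion matrix is meant. It illustrates how such a matrix and its spectral norm could be computed from adversarial predictions. The function names, the zeroed diagonal, the row normalization, and the toy labels are all illustrative assumptions, not the authors' definitions or code.

```python
import math

def confusion_matrix(y_true, y_pred, num_classes):
    """Hypothetical robust confusion matrix: entry [i][j] is the fraction of
    class-i samples that are (adversarially) predicted as class j != i.
    The diagonal is left at zero, so each row sum is that class's error rate."""
    counts = [[0.0] * num_classes for _ in range(num_classes)]
    totals = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t != p:
            counts[t][p] += 1.0
    for i in range(num_classes):
        if totals[i]:
            counts[i] = [c / totals[i] for c in counts[i]]
    return counts

def spectral_norm(mat, iters=200):
    """Largest singular value of mat, via power iteration on mat^T mat."""
    rows, cols = len(mat), len(mat[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        u = [sum(mat[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(mat[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0  # all-zero matrix: no errors at all
        v = [x / norm for x in w]
    u = [sum(mat[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))

# Toy data (made up): predictions under attack for three classes.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_adv = [0, 0, 0, 0, 1, 1, 1, 2, 1, 1, 1, 2]

C = confusion_matrix(y_true, y_adv, num_classes=3)
per_class_error = [sum(row) for row in C]
worst_class_error = max(per_class_error)
sigma = spectral_norm(C)
```

In this toy case the errors are concentrated in one class, so the worst-class error and the spectral norm coincide. In general they are different quantities, and how the paper relates them is given by its bound, which the truncated abstract does not show.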
Peer Reviews
Decision: ICLR 2025 Poster
- The paper is well-written and easy to follow. It gives adequate background on the topics of fairness, adversarial robustness, and PAC-bound, which can be very helpful for readers who are not familiar with the field to understand the gist of the paper.
- The contribution is quite significant because: 1. this is the first PAC-Bayesian framework to characterize the worst-class robust error across different classes. 2. the proposed regularization term is novel because it aims to improve the robust
- The motivation from the intro feels disconnected from the method. In Figure 2, the authors argue that the class exhibiting the worst robust performance on the training set may not be the same as the one on the test set. However, how would the proposed method in Section 4 address this issue? It would be great if the authors could provide some discussion about the connection between this motivation and the proposed method.
- It would be great to also see some analysis for the hyper-parameters $
- The approximation in Section 4.1 makes it possible to implement an adversarial spectral-norm regularizer effectively.
- Based on limited empirical analysis, the method looks robust to the choice of hyperparameters.
- The method performs well against SOTA attack methods (AutoAttack), and not just the limited setting considered in the theoretical analysis.
- Empirical analysis is somewhat limited in both the domain and the datasets used. It is hard to argue, based only on the provided analysis, that the results will extend to larger models, other vision datasets, or non-vision classification tasks. Adding experiments with much larger datasets (e.g. larger ImageNet) or non-vision tasks would greatly improve the empirical analysis in the paper.
1. This work represents the first endeavor to develop a PAC-Bayesian framework to characterize the worst-class robust error across different classes. This is an important problem.
2. This paper is theoretically solid.
3. The improvement brought by this method is obvious in the experiments.
1. The fairness improvement method in this paper seems to be designed only for adversarial training. Does it bring any improvement for unfairness arising in other settings (such as a long-tailed training distribution)?
2. In the experiments, the authors targeted AutoAttack. Is the method also effective against other attack methods?
Taxonomy
Topics: Adversarial Robustness in Machine Learning
Methods: Sparse Evolutionary Training
