Alleviating the Effect of Data Imbalance on Adversarial Training
Guanlin Li, Guowen Xu, Tianwei Zhang

TL;DR
This paper addresses the challenge of data imbalance in adversarial training by proposing a re-balancing framework that improves robustness and accuracy on long-tailed datasets.
Contribution
It introduces REAT, a novel adversarial training framework with a new strategy and penalty function to handle long-tailed data distributions.
Findings
REAT enhances robustness on imbalanced datasets.
It maintains high clean accuracy.
The framework is effective across various datasets and models.
Abstract
In this paper, we study adversarial training on datasets that obey the long-tailed distribution, which is practical but rarely explored in previous works. Compared with conventional adversarial training on balanced datasets, this process falls into the dilemma of generating uneven adversarial examples (AEs) and an unbalanced feature embedding space, causing the resulting model to exhibit low robustness and accuracy on tail data. To combat that, we theoretically analyze the lower bound of the robust risk to train a model on a long-tailed dataset to obtain the key challenges in addressing the aforementioned dilemmas. Based on it, we propose a new adversarial training framework -- Re-balancing Adversarial Training (REAT). This framework consists of two components: (1) a new training strategy inspired by the effective number to guide the model to generate more balanced and informative AEs;…
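The abstract says the first REAT component is inspired by the effective number. As a rough illustration of that idea only, the sketch below computes class weights from the standard effective-number formula E_n = (1 - beta^n) / (1 - beta), where n is the number of samples in a class. This is not the authors' implementation: the function name, the beta value, and the class counts are all hypothetical, and how REAT actually uses such weights during AE generation is not specified in the excerpt above.

```python
import numpy as np


def effective_number_weights(class_counts, beta=0.999):
    """Return per-class weights inversely proportional to the effective number.

    Hypothetical helper for illustration; `beta` is an assumed hyperparameter.
    """
    counts = np.asarray(class_counts, dtype=np.float64)
    # Effective number of samples per class: (1 - beta^n) / (1 - beta).
    effective_num = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    # Rarer classes have a smaller effective number, hence a larger weight.
    weights = 1.0 / effective_num
    # Normalize so the weights sum to the number of classes.
    return weights * len(counts) / weights.sum()


# Hypothetical long-tailed class counts (head class first, tail class last).
counts = [5000, 2997, 1796, 1077, 645, 387, 232, 139, 83, 50]
weights = effective_number_weights(counts)
for n, w in zip(counts, weights):
    print(f"samples={n:5d}  weight={w:.4f}")
```

With counts sorted from head to tail, the weights increase monotonically, so tail classes contribute more per sample. A smaller beta flattens the weights toward uniform, while a beta close to 1 approaches inverse-frequency weighting.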
Peer Reviews
Decision: ICLR 2024 Conference Withdrawn Submission
The authors show that the head class dominates the AE labels and the feature embedding space, leading to underfitting of the tail class. The authors propose RBL and TAIL, which target the maximization and minimization processes respectively, to jointly solve this problem. This paper is well-written and easy to understand, and the experimental results are extensive.
Fig. 3 shows that the effectiveness of RBL is not significant. Could introducing a hyperparameter further accentuate the weight disparities to enhance robustness? TAIL only improves the TC's feature space. Could TAIL be tuned to align the feature space for all classes, e.g. by class-dependent maximization of the KL divergence between each class and the first head class? The performance improvement of the REAT framework is relatively weak, using 25% additional computing overhead to improve performance by
The paper is clearly written and the proposed methods seem reasonable.
I have the following concerns: 1. The performance improvement is very marginal compared to the original AT. 2. No evidence is provided to show why the tail feature alignment is necessary. For example, it is not clear whether it compromises performance. 3. Similar studies have been conducted previously, namely in [1]. [1] Imbalanced adversarial training with reweighting, Wang et al., 2022.
1. This paper focuses on an important but somewhat overlooked problem in adversarial training (AT): the imbalanced dataset setting. 2. This paper is clearly written and well-organized. The first two sections detail the introduction of related work and the motivation of the proposed method. 3. The motivations of the proposed method, observations on the prediction distribution of adversarial example generation during AT, and feature embeddings are clear and supported by empirical validation and t
1. Adversarial training is known to result in a robust fairness issue. This means that even on a balanced dataset, the robustness of different classes varies significantly, which can be seen as an inherent data imbalance effect. Therefore, it would be beneficial to compare the proposed method with methods [2, 3] that address this issue (or at least mention them in the related work). 2. The improvement of the proposed method appears to be limited compared to RoBal [1]. While this is not a major concern, it w
1. The domain this paper is tackling is an important one. The motivations of the two main contributions of this paper (re-weighting and TAIL) are clear. In particular, the inclusion of the effective number to handle AE generation is unique and well formulated. 2. The empirical studies (in particular the ablation studies) display the efficacy of the proposed method very well.
1. The authors provide experimental results that show the efficacy of the proposed method. However, the improvements are not consistent across the board. For example, the clean accuracy of REAT always seems to be lower than the baselines at UR = 10. Similarly, the adversarial accuracy of REAT is not always better, and in many cases the improvements are marginal. The experiments are also limited to the CIFAR-10-LT and CIFAR-100-LT datasets. Experimenting with further datasets would have been helpful
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification
