FAIR-TAT: Improving Model Fairness Using Targeted Adversarial Training

Tejaswini Medi; Steffen Jung; Margret Keuper

arXiv:2410.23142·cs.LG·June 17, 2025

Open Access

TL;DR

This paper introduces FAIR-TAT, a targeted adversarial training method that reduces disparities in class-wise robustness (model fairness) while improving robustness against adversarial attacks and common corruptions.

Contribution

The paper proposes a targeted adversarial training approach that enhances fairness and robustness, outperforming traditional untargeted methods in adversarial settings.
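The distinction between targeted and untargeted perturbations can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a toy linear softmax classifier, a single sign-gradient step, and hypothetical names (`untargeted_step`, `targeted_step`, `W`, `b`, `eps`). How FAIR-TAT selects target classes is not described on this page and is not modeled here.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict_proba(W, b, x):
    """Class probabilities of a linear softmax model (illustrative assumption)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)

def ce_grad_wrt_input(W, b, x, label):
    """Gradient of the cross-entropy loss for `label` with respect to the input x."""
    p = predict_proba(W, b, x)
    diff = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    return [sum(W[k][j] * diff[k] for k in range(len(W))) for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def untargeted_step(W, b, x, y_true, eps):
    """Move x to INCREASE the loss on the true label (any wrong class will do)."""
    g = ce_grad_wrt_input(W, b, x, y_true)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

def targeted_step(W, b, x, y_target, eps):
    """Move x to DECREASE the loss on a chosen target label."""
    g = ce_grad_wrt_input(W, b, x, y_target)
    return [xi - eps * sign(gi) for xi, gi in zip(x, g)]

# Toy 3-class model on 2-D inputs (values chosen only for illustration).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x = [1.0, 0.2]
y_true, y_target, eps = 0, 2, 0.1

p_clean = predict_proba(W, b, x)
p_untargeted = predict_proba(W, b, untargeted_step(W, b, x, y_true, eps))
p_targeted = predict_proba(W, b, targeted_step(W, b, x, y_target, eps))

print("true-class prob: clean vs untargeted:", p_clean[y_true], p_untargeted[y_true])
print("target-class prob: clean vs targeted:", p_clean[y_target], p_targeted[y_target])
```

In adversarial training, perturbed inputs of this kind would be fed back to the model with their original labels. The untargeted step only pushes away from the true class, whereas the targeted step steers the input toward a specific class, which is the property a targeted training scheme can exploit.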

Findings

01

Targeted adversarial training improves fairness trade-offs.

02

FAIR-TAT enhances robustness against diverse adversarial threats.

03

Empirical results show increased fairness and robustness in models.

Abstract

Deep neural networks are susceptible to adversarial attacks and common corruptions, which undermine their robustness. To enhance model resilience against such challenges, Adversarial Training (AT) has emerged as a prominent solution. Nevertheless, adversarial robustness is often attained at the expense of model fairness during AT, which shows up as a disparity in the class-wise robustness of the model. While distinctive classes become more robust towards such adversaries, hard-to-detect classes suffer. Recently, research has focused on improving model fairness specifically for perturbed images, overlooking the accuracy of the most likely non-perturbed data. Additionally, despite their robustness against the adversaries encountered during model training, state-of-the-art adversarially trained models have difficulty maintaining robustness and fairness when confronted with diverse adversarial threats…
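The notion of fairness used in the abstract, a disparity in class-wise robustness, can be made concrete with a small sketch. The helper names (`classwise_accuracy`, `fairness_summary`) and the toy labels and predictions are illustrative assumptions, not the paper's evaluation code or results; the best-minus-worst gap is one simple way to summarize the disparity.

```python
from collections import defaultdict

def classwise_accuracy(labels, predictions):
    """Return {class: accuracy}, computed separately for each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, predictions):
        total[y] += 1
        if y == p:
            correct[y] += 1
    return {c: correct[c] / total[c] for c in total}

def fairness_summary(labels, predictions):
    """Summarize class-wise disparity: worst class, best class, and their gap."""
    acc = classwise_accuracy(labels, predictions)
    worst = min(acc, key=acc.get)
    best = max(acc, key=acc.get)
    return {
        "per_class": acc,
        "worst_class": worst,
        "worst_acc": acc[worst],
        "best_class": best,
        "best_acc": acc[best],
        "gap": acc[best] - acc[worst],
        "mean_acc": sum(acc.values()) / len(acc),
    }

# Made-up predictions on adversarially perturbed inputs (illustration only).
labels      = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
adv_preds   = [0, 0, 0, 0, 1, 1, 1, 2, 1, 1, 2, 0]

summary = fairness_summary(labels, adv_preds)
print(summary["per_class"])
print("worst class:", summary["worst_class"], "gap:", summary["gap"])
```

A model can have a reasonable mean robust accuracy while one class lags far behind; reporting the worst-class accuracy alongside the mean exposes that case. Running the same summary on clean predictions would show whether a fairness gain on perturbed inputs comes at the cost of clean accuracy.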

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)