Adversarial Bias: Data Poisoning Attacks on Fairness

Eunice Chan; Hanghang Tong

arXiv:2511.08331·cs.LG·November 12, 2025

TL;DR

This paper reveals how simple adversarial data poisoning can severely compromise the fairness of machine learning models, demonstrating a practical attack method that outperforms existing techniques across various datasets and models.

Contribution

It introduces a theoretically grounded adversarial poisoning strategy that effectively induces unfairness in classifiers, highlighting a new vulnerability in AI fairness.

Findings

01. The attack significantly degrades fairness metrics across multiple models.

02. The method outperforms existing fairness attack techniques.

03. It maintains high model accuracy while increasing unfairness.

Abstract

With the growing adoption of AI and machine learning systems in real-world applications, ensuring their fairness has become increasingly critical. Most work in algorithmic fairness focuses on assessing and improving the fairness of machine learning systems. There is relatively little research on fairness vulnerability, i.e., how an AI system's fairness can be intentionally compromised. In this work, we first provide a theoretical analysis demonstrating that a simple adversarial poisoning strategy is sufficient to induce maximally unfair behavior in naive Bayes classifiers. Our key idea is to strategically inject a small fraction of carefully crafted adversarial data points into the training set, biasing the model's decision boundary to disproportionately affect a protected group while preserving generalizable performance. To illustrate the practical effectiveness of our…
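As a rough illustration of the key idea in the abstract, the toy sketch below trains a small categorical naive Bayes classifier on clean data and again on data with a few injected points that tie the protected attribute to the label. It is not the authors' attack. The synthetic dataset, the poison points, the helper names (make_clean_data, fit_nb, dp_gap, accuracy), and the use of the demographic parity gap as the fairness metric are all assumptions made for this example.

```python
import math
from collections import Counter


def make_clean_data():
    """Hypothetical toy dataset of (s, x, y) rows; s is the protected attribute."""
    # (x value, rows with y=0, rows with y=1); identical for both groups,
    # so the clean data carries no link between s and y.
    spec = [(0, 18, 2), (1, 9, 11), (2, 2, 18)]
    rows = []
    for s in (0, 1):
        for x, n0, n1 in spec:
            rows += [(s, x, 0)] * n0 + [(s, x, 1)] * n1
    return rows


def fit_nb(rows, alpha=1.0):
    """Categorical naive Bayes with Laplace smoothing; returns predict(s, x)."""
    n_y = Counter(y for _, _, y in rows)
    n_sy = Counter((s, y) for s, _, y in rows)
    n_xy = Counter((x, y) for _, x, y in rows)

    def predict(s, x):
        scores = {}
        for y in (0, 1):
            log_prior = math.log(n_y[y] / len(rows))
            # s takes 2 values and x takes 3, hence the smoothing denominators.
            log_s = math.log((n_sy[(s, y)] + alpha) / (n_y[y] + 2 * alpha))
            log_x = math.log((n_xy[(x, y)] + alpha) / (n_y[y] + 3 * alpha))
            scores[y] = log_prior + log_s + log_x
        return max(scores, key=scores.get)

    return predict


def positive_rate(predict, rows, group):
    """Share of rows in the given group that the model labels positive."""
    preds = [predict(s, x) for s, x, _ in rows if s == group]
    return sum(preds) / len(preds)


def dp_gap(predict, rows):
    """Demographic parity gap: absolute difference in positive rates."""
    return abs(positive_rate(predict, rows, 0) - positive_rate(predict, rows, 1))


def accuracy(predict, rows):
    return sum(predict(s, x) == y for s, x, y in rows) / len(rows)


clean = make_clean_data()
# Assumed poison: 16 rows at the ambiguous feature value x=1 that pair
# s=1 with the negative label and s=0 with the positive label.
poison = [(1, 1, 0)] * 8 + [(0, 1, 1)] * 8

clean_model = fit_nb(clean)
poisoned_model = fit_nb(clean + poison)

for name, model in [("clean", clean_model), ("poisoned", poisoned_model)]:
    print(name, "dp_gap:", round(dp_gap(model, clean), 3),
          "accuracy:", round(accuracy(model, clean), 3))
```

On this toy data, evaluated on the clean rows, the demographic parity gap moves from 0 to about 0.33 while accuracy drops only from about 0.78 to about 0.77. The poison is placed where the clean model is least certain (x=1), so a modest shift in the learned link between the protected attribute and the label is enough to flip predictions for one group only.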

Taxonomy

Topics: Ethics and Social Impacts of AI · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)