Fairness Is Not Just Ethical: Performance Trade-Off via Data Correlation Tuning to Mitigate Bias in ML Software
Ying Xiao, Shangwen Wang, Sicen Liu, Dingyuan Xue, Xian Zhan, Yepang Liu, Jie M. Zhang

TL;DR
This paper introduces Correlation Tuning (CoT), a pre-processing bias mitigation method that adjusts data correlations to improve fairness and performance in machine learning models, outperforming existing techniques.
Contribution
The paper proposes CoT, a novel pre-processing approach using correlation tuning and multi-objective optimization to effectively mitigate bias and enhance fairness in ML models.
Findings
Increases true positive rate for unprivileged groups by 17.5%.
Reduces bias metrics (statistical parity difference, SPD; average odds difference, AOD; equal opportunity difference, EOD) by over 50%.
Outperforms state-of-the-art bias mitigation methods by 3-10 percentage points.
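The three bias metrics named in the findings can be illustrated with a minimal sketch. It assumes binary labels, binary predictions, and a binary sensitive attribute where 1 marks the privileged group, and it uses the common "unprivileged minus privileged" sign convention, under which values near 0 indicate less bias. The function names `group_rates` and `fairness_metrics` are hypothetical, and the paper may compute these metrics differently (for example, as absolute values).

```python
import numpy as np


def group_rates(y_true, y_pred):
    """Return (selection rate, TPR, FPR) for one group.

    Assumes the group contains at least one positive and one negative label.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    selection_rate = y_pred.mean()      # P(y_hat = 1)
    tpr = y_pred[y_true == 1].mean()    # P(y_hat = 1 | y = 1)
    fpr = y_pred[y_true == 0].mean()    # P(y_hat = 1 | y = 0)
    return selection_rate, tpr, fpr


def fairness_metrics(y_true, y_pred, sensitive):
    """Compute SPD, AOD, and EOD as unprivileged minus privileged.

    `sensitive` is assumed to be 1 for the privileged group, 0 otherwise.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    privileged = np.asarray(sensitive) == 1

    sel_u, tpr_u, fpr_u = group_rates(y_true[~privileged], y_pred[~privileged])
    sel_p, tpr_p, fpr_p = group_rates(y_true[privileged], y_pred[privileged])

    return {
        "SPD": sel_u - sel_p,                               # statistical parity difference
        "AOD": 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),   # average odds difference
        "EOD": tpr_u - tpr_p,                               # equal opportunity difference
    }


# Toy data (made up for illustration, not from the paper).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]
metrics = fairness_metrics(y_true, y_pred, sensitive)
print(metrics)
```

On this toy data the unprivileged group has a lower selection rate and a lower true positive rate than the privileged group, so all three metrics come out negative. A mitigation method that raises the unprivileged true positive rate moves EOD and AOD toward 0.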
Abstract
Traditional software fairness research typically emphasizes ethical and social imperatives, neglecting that fairness fundamentally represents a core software quality issue arising directly from performance disparities across sensitive user groups. Recognizing fairness explicitly as a software quality dimension yields practical benefits beyond ethical considerations, notably improved predictive performance for unprivileged groups, enhanced out-of-distribution generalization, and increased geographic transferability in real-world deployments. Nevertheless, existing bias mitigation methods face a critical dilemma: while pre-processing methods offer broad applicability across model types, they generally fall short in effectiveness compared to post-processing techniques. To overcome this challenge, we propose Correlation Tuning (CoT), a novel pre-processing approach designed to mitigate bias…
Taxonomy
Topics: Software System Performance and Reliability · Software Engineering Research · Mobile Crowdsensing and Crowdsourcing
