Privacy-Preserving Debiasing using Data Augmentation and Machine Unlearning
Zhixin Pan, Emma Andrews, Laura Chang, Prabhat Mishra

TL;DR
This paper introduces a novel approach combining diffusion-based data augmentation and multi-shard machine unlearning to reduce bias and enhance privacy protection in machine learning models, demonstrating effectiveness across various datasets.
Contribution
It presents a new method that jointly addresses data bias and privacy concerns using diffusion-based augmentation and unlearning, with provable privacy guarantees.
Findings
Significant bias reduction achieved across datasets.
Enhanced robustness against privacy attacks.
Effective balance between fairness and privacy.
Abstract
Data augmentation is widely used to mitigate data bias in the training dataset. However, data augmentation exposes machine learning models to privacy attacks, such as membership inference attacks. In this paper, we propose an effective combination of data augmentation and machine unlearning, which can reduce data bias while providing a provable defense against known attacks. Specifically, we maintain the fairness of the trained model with diffusion-based data augmentation, and then utilize multi-shard unlearning to remove identifying information of original data from the ML model for protection against privacy attacks. Experimental evaluation across diverse datasets demonstrates that our approach can achieve significant improvements in bias reduction as well as robustness against state-of-the-art privacy attacks.
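The abstract does not spell out how multi-shard unlearning works, so the following is only a minimal sketch of the general shard-and-retrain idea (in the style of SISA training), not the authors' implementation. It assumes records are assigned to shards by ID, one small model is trained per shard, and predictions are combined by majority vote. The nearest-centroid classifier and the names `ShardedEnsemble`, `train_centroids`, `predict_one`, and `unlearn` are hypothetical choices made for illustration.

```python
from collections import Counter


def train_centroids(records):
    """Fit a toy nearest-centroid model: label -> mean feature vector."""
    sums, counts = {}, Counter()
    for _, features, label in records:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {lbl: [s / counts[lbl] for s in acc] for lbl, acc in sums.items()}


def predict_one(centroids, features):
    """Return the label whose centroid is closest to the features."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(center, features))
    return min(centroids, key=lambda lbl: sqdist(centroids[lbl]))


class ShardedEnsemble:
    """Hypothetical shard-based ensemble; each record lives in exactly one shard."""

    def __init__(self, records, num_shards):
        # records: list of (record_id, features, label) tuples
        self.shards = [[] for _ in range(num_shards)]
        for rec in records:
            self.shards[rec[0] % num_shards].append(rec)
        self.models = [train_centroids(shard) for shard in self.shards]

    def predict(self, features):
        # Majority vote over the per-shard models (empty models are skipped).
        votes = Counter(predict_one(m, features) for m in self.models if m)
        return votes.most_common(1)[0][0]

    def unlearn(self, record_id):
        # Drop the record and retrain only the shard that contained it.
        idx = record_id % len(self.shards)
        before = len(self.shards[idx])
        self.shards[idx] = [r for r in self.shards[idx] if r[0] != record_id]
        if len(self.shards[idx]) != before:
            self.models[idx] = train_centroids(self.shards[idx])
        return idx
```

Because each record influences only one shard's model, removing it requires retraining that shard alone; the other shard models are left untouched. In the setting the abstract describes, the records to unlearn would be the original samples, while the augmented samples remain in training.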
Taxonomy
Topics: Privacy-Preserving Technologies in Data
