Bias Mitigation in Fine-tuning Pre-trained Models for Enhanced Fairness and Efficiency
Yixuan Zhang, Feng Zhou

TL;DR
This paper presents a novel bias mitigation framework for fine-tuning pre-trained models, improving fairness and efficiency by neutralizing influential weights and employing low-rank matrix factorization, validated through extensive experiments.
Contribution
The paper introduces a transfer learning strategy that neutralizes the importance of demographic-influential weights and combines it with low-rank matrix factorization to improve fairness and reduce computational cost.
Findings
Effective bias mitigation across multiple models and tasks
Reduced computational complexity with low-rank approximation
Improved fairness metrics in fine-tuned models
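The reduced-complexity finding can be illustrated with simple parameter arithmetic. The sketch below is a generic rank-r factorization of a weight update, not the authors' implementation; the layer sizes, the rank, and the helper names (`dense_params`, `low_rank_params`, `matmul`) are assumptions chosen for illustration.

```python
def dense_params(d_out, d_in):
    """Number of trainable entries in a full d_out x d_in weight update."""
    return d_out * d_in


def low_rank_params(d_out, d_in, rank):
    """Entries in a factorized update A @ B, with A: d_out x rank and B: rank x d_in."""
    return rank * (d_out + d_in)


def matmul(a, b):
    """Plain-Python matrix product, used to rebuild the full update from its factors."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical layer size and rank (assumed values, not taken from the paper).
d_out, d_in, rank = 768, 768, 8
dense = dense_params(d_out, d_in)
low = low_rank_params(d_out, d_in, rank)
ratio = dense / low

# Tiny rank-1 example: a 3 x 1 factor times a 1 x 2 factor gives a 3 x 2 update.
A = [[1.0], [2.0], [3.0]]
B = [[0.5, -1.0]]
update = matmul(A, B)
```

Under these assumed sizes the factorized update trains 12,288 values instead of 589,824, a 48-fold reduction. Actual savings depend on which layers are factorized and on the chosen rank.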
Abstract
Fine-tuning pre-trained models is a widely used technique in many real-world applications. However, fine-tuning these models on new tasks can lead to unfair outcomes, because fairness properties carry no generalization guarantees, regardless of whether the original pre-trained model was developed with fairness considerations. To tackle this issue, we introduce an efficient and robust fine-tuning framework specifically designed to mitigate biases in new tasks. Our empirical analysis shows that different parameters of the pre-trained model affect predictions for different demographic groups. Based on this observation, we employ a transfer learning strategy that neutralizes the importance of these influential weights, determined using Fisher information across demographic groups. Additionally, we integrate this weight importance neutralization…
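As a rough illustration of measuring weight importance with Fisher information across demographic groups, the sketch below computes an empirical diagonal Fisher estimate per group for a toy logistic-regression model. The model, the data, and the function names (`diag_fisher`, `fisher_by_group`) are assumptions for illustration; the truncated abstract does not specify how the paper estimates Fisher information or how it neutralizes the resulting importances, so the neutralization step is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def diag_fisher(weights, examples):
    """Empirical diagonal Fisher: mean squared log-likelihood gradient per weight."""
    totals = [0.0] * len(weights)
    for features, label in examples:
        p = sigmoid(sum(w * x for w, x in zip(weights, features)))
        for j, x in enumerate(features):
            grad = (label - p) * x  # d/dw_j of the Bernoulli log-likelihood
            totals[j] += grad * grad
    return [t / len(examples) for t in totals]


def fisher_by_group(weights, data):
    """Split (features, label, group) records by group and score each group separately."""
    groups = {}
    for features, label, group in data:
        groups.setdefault(group, []).append((features, label))
    return {g: diag_fisher(weights, ex) for g, ex in groups.items()}


# Toy data (assumed): group "a" only activates feature 0, group "b" only feature 1.
weights = [0.0, 0.0]
data = [
    ([1.0, 0.0], 1, "a"),
    ([2.0, 0.0], 0, "a"),
    ([0.0, 1.0], 1, "b"),
    ([0.0, 2.0], 0, "b"),
]
fisher = fisher_by_group(weights, data)

# Per-weight gap in importance between the two groups.
gap = [abs(fa - fb) for fa, fb in zip(fisher["a"], fisher["b"])]
```

In this toy setup each group assigns all of its importance to a different weight, which mirrors the abstract's observation that different parameters affect predictions for different demographic groups.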
Taxonomy
Topics: Safety Systems Engineering in Autonomy
