Breaking Down Bias: On The Limits of Generalizable Pruning Strategies

Sibo Ma; Alejandro Salinas; Peter Henderson; Julian Nyarko

arXiv:2502.07771·cs.CL·February 12, 2025

Open Access

TL;DR

This paper uses model pruning to probe and mitigate racial bias in large language models. It finds that such bias is only partially represented as a general concept, so a pruning strategy fitted to one context transfers poorly to others.

Contribution

The paper demonstrates that neuron-based pruning can reduce racial bias effectively, but that it generalizes poorly across contexts and bias types.

Findings

01. Pruning can reduce bias without significantly increasing anomalous model behavior.

02. Neuron-based pruning generally outperforms pruning of entire attention heads in bias mitigation.

03. The effectiveness of pruning deteriorates quickly as strategies are generalized across contexts.

Abstract

We employ model pruning to examine how LLMs conceptualize racial biases, and whether a generalizable mitigation strategy for such biases appears feasible. Our analysis yields several novel insights. We find that pruning can be an effective method to reduce bias without significantly increasing anomalous model behavior. Neuron-based pruning strategies generally yield better results than approaches pruning entire attention heads. However, our results also show that the effectiveness of either approach quickly deteriorates as pruning strategies become more generalized. For instance, a model that is trained on removing racial biases in the context of financial decision-making poorly generalizes to biases in commercial transactions. Overall, our analysis suggests that racial biases are only partially represented as a general concept within language models. The other part of these biases is…
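The contrast between neuron-level and head-level pruning described in the abstract can be illustrated with a toy sketch. This is not the authors' procedure: the per-neuron "bias scores", the two context names, and every function name below are hypothetical, chosen only to show how the two granularities differ and how one might measure whether the units selected in one context overlap with those selected in another.

```python
from typing import List, Set


def top_k_neurons(scores: List[float], k: int) -> Set[int]:
    """Return the indices of the k neurons with the highest (hypothetical) bias scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def top_k_heads(scores: List[float], head_size: int, k: int) -> Set[int]:
    """Return all neuron indices belonging to the k heads with the highest summed score.

    Assumes neurons are laid out contiguously, head_size neurons per head.
    """
    n_heads = len(scores) // head_size
    head_scores = [
        sum(scores[h * head_size:(h + 1) * head_size]) for h in range(n_heads)
    ]
    ranked = sorted(range(n_heads), key=lambda h: head_scores[h], reverse=True)
    pruned: Set[int] = set()
    for h in ranked[:k]:
        pruned.update(range(h * head_size, (h + 1) * head_size))
    return pruned


def apply_mask(activations: List[float], pruned: Set[int]) -> List[float]:
    """Zero out the activations of pruned units and leave the rest unchanged."""
    return [0.0 if i in pruned else a for i, a in enumerate(activations)]


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Overlap between two pruned sets: 1.0 means identical, 0.0 means disjoint."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Made-up scores for 8 neurons arranged as 2 heads of 4, in two contexts.
scores_finance = [0.9, 0.1, 0.0, 0.2, 0.1, 0.8, 0.1, 0.0]
scores_commerce = [0.9, 0.0, 0.1, 0.1, 0.0, 0.1, 0.7, 0.2]
activations = [1.0] * 8

neurons_fin = top_k_neurons(scores_finance, k=2)          # two individual neurons
heads_fin = top_k_heads(scores_finance, head_size=4, k=1)  # one whole head
neurons_com = top_k_neurons(scores_commerce, k=2)

masked = apply_mask(activations, neurons_fin)
overlap = jaccard(neurons_fin, neurons_com)
```

In this toy setup, head-level pruning removes four units to reach the highest-scoring one, while neuron-level pruning removes only the two highest-scoring units. The `jaccard` value is one simple way to quantify how much a selection made in one context coincides with a selection made in another; a low value would be consistent with context-specific representations, but the numbers here are invented and say nothing about the paper's results.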

Taxonomy

Topics: Complex Systems and Decision Making

Methods: Softmax · Attention Is All You Need · Pruning