Mitigating Gender Bias in Code Large Language Models via Model Editing

Zhanyue Qin; Haochuan Wang; Zecheng Wang; Deyuan Liu; Cunhang Fan; Zhao Lv; Zhiying Tu; Dianhui Chu; Dianbo Sui

arXiv:2410.07820·cs.SE·October 11, 2024

TL;DR

This paper introduces a dataset and metric to evaluate gender bias in code LLMs, and proposes MG-Editing, a model editing method at multiple granularities, to mitigate bias while preserving code generation performance.

Contribution

It develops a novel model editing approach, MG-Editing, that effectively reduces gender bias in code LLMs at various parameter granularities without sacrificing performance.

Findings

01. MG-Editing significantly reduces gender bias in code LLMs.

02. Applying MG-Editing at row and neuron levels yields the best bias mitigation and performance balance.

03. The proposed methods generalize well across different models and bias scenarios.
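
To make the idea of editing at a chosen parameter granularity concrete, here is a minimal sketch that applies a gradient-style update only to selected rows, or only to selected individual entries, of a weight matrix. It is an illustration under stated assumptions, not the authors' implementation: the function name `masked_update`, the reading of "neuron level" as single weight entries, and the toy numbers are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

Matrix = List[List[float]]


def masked_update(
    weights: Matrix,
    grads: Matrix,
    lr: float,
    rows: Optional[Set[int]] = None,
    entries: Optional[Set[Tuple[int, int]]] = None,
) -> Matrix:
    """Return a copy of `weights` with an SGD-style step applied only
    to the allowed rows and/or the allowed (row, column) entries.

    Hypothetical helper: everything outside the selection stays frozen.
    """
    updated = [row[:] for row in weights]  # copy, so the input is untouched
    for i, row in enumerate(weights):
        for j in range(len(row)):
            row_selected = rows is not None and i in rows
            entry_selected = entries is not None and (i, j) in entries
            if row_selected or entry_selected:
                updated[i][j] = weights[i][j] - lr * grads[i][j]
    return updated


# Toy values, chosen only for illustration.
weights = [[1.0, 2.0], [3.0, 4.0]]
grads = [[1.0, 1.0], [1.0, 1.0]]

row_level = masked_update(weights, grads, lr=0.5, rows={0})
entry_level = masked_update(weights, grads, lr=0.5, entries={(1, 1)})
```

In this sketch a finer selection changes fewer parameters, which is one intuition for why fine-grained edits might disturb general code generation ability less. The page itself reports only the outcome, not the mechanism.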

Abstract

In recent years, with the maturation of large language model (LLM) technology and the emergence of high-quality programming code datasets, researchers have become increasingly confident in addressing the challenges of program synthesis automatically. However, since most of the training samples for LLMs are unscreened, it is inevitable that LLMs' performance may not align with real-world scenarios, leading to the presence of social bias. To evaluate and quantify the gender bias in code LLMs, we propose a dataset named CodeGenBias (Gender Bias in the Code Generation) and an evaluation metric called FB-Score (Factual Bias Score) based on the actual gender distribution of correlative professions. With the help of CodeGenBias and FB-Score, we evaluate and analyze the gender bias in eight mainstream Code LLMs. Previous work has demonstrated that model editing methods that perform well in…
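
The excerpt above does not give the FB-Score formula, so the following is not the paper's metric. It is only a rough sketch of the general idea of comparing a model's gender outputs for a profession against a real-world reference share. The function names `female_share` and `factual_gap`, the absolute-difference form, and all numbers are assumptions made for illustration.

```python
from typing import List


def female_share(completions: List[str]) -> float:
    """Fraction of gendered completions that are 'female'.

    Hypothetical helper: completions are assumed to be already
    normalized to the strings 'male' or 'female'; others are ignored.
    """
    gendered = [c for c in completions if c in ("male", "female")]
    if not gendered:
        raise ValueError("no gendered completions to score")
    return sum(1 for c in gendered if c == "female") / len(gendered)


def factual_gap(model_share: float, reference_share: float) -> float:
    """Absolute gap between the model's share and the reference share.

    0.0 means the model matches the reference distribution exactly.
    This is an illustrative stand-in, not the FB-Score definition.
    """
    for value in (model_share, reference_share):
        if not 0.0 <= value <= 1.0:
            raise ValueError("shares must lie in [0, 1]")
    return abs(model_share - reference_share)


# Made-up example: four sampled completions for one profession prompt,
# scored against a made-up reference share of 0.5.
samples = ["male", "male", "male", "female"]
gap = factual_gap(female_share(samples), reference_share=0.5)
```

A reference-based score of this kind treats the real-world distribution, rather than a 50/50 split, as the target, which appears to be the motivation behind a "factual" bias measure.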

Taxonomy

Topics: Natural Language Processing Techniques · Cancer-related gene regulation

Methods: ALIGN