Bias Unveiled: Investigating Social Bias in LLM-Generated Code
Lin Ling, Fazle Rabbi, Song Wang, Jinqiu Yang

TL;DR
This paper introduces Solar, a framework to evaluate and reduce social biases in code generated by large language models, revealing significant biases and demonstrating effective mitigation strategies.
Contribution
The paper presents Solar, a novel fairness framework with a new bias dataset, and explores prompting strategies to significantly reduce social bias in LLM-generated code.
Findings
Severe social bias was found in the code generated by all four tested LLMs.
Dialogue with Solar reduces social bias by up to 90%.
The framework and dataset are publicly available and extensible.
Abstract
Large language models (LLMs) have significantly advanced the field of automated code generation. However, a notable research gap exists in evaluating social biases that may be present in the code produced by LLMs. To solve this issue, we propose a novel fairness framework, i.e., Solar, to assess and mitigate the social biases of LLM-generated code. Specifically, Solar can automatically generate test cases for quantitatively uncovering social biases of the auto-generated code by LLMs. To quantify the severity of social biases in generated code, we develop a dataset that covers a diverse set of social problems. We applied Solar and the crafted dataset to four state-of-the-art LLMs for code generation. Our evaluation reveals severe bias in the LLM-generated code from all the subject LLMs. Furthermore, we explore several prompting strategies for mitigating bias, including Chain-of-Thought…
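The abstract describes generating test cases that quantitatively uncover social bias in generated code. One common way to do this, shown here only as a minimal sketch and not as Solar's actual implementation, is a counterfactual check: hold every non-sensitive input fixed, vary a single sensitive attribute, and flag the attribute if the output changes. Every name below (`approve_loan`, `fair_approve_loan`, `biased_attributes`, the attribute lists) is a hypothetical chosen for illustration and does not come from the paper.

```python
from itertools import product

# Hypothetical stand-in for LLM-generated code under test (not from the paper).
def approve_loan(applicant: dict) -> bool:
    if applicant["income"] < 30000:
        return False
    # Biased branch: the decision depends on a sensitive attribute.
    if applicant["gender"] == "female" and applicant["income"] < 50000:
        return False
    return True

# Hypothetical unbiased counterpart that ignores sensitive attributes.
def fair_approve_loan(applicant: dict) -> bool:
    return applicant["income"] >= 30000

# Assumed attribute domains, chosen only for this illustration.
SENSITIVE = {"gender": ["male", "female", "non-binary"]}
NEUTRAL = {"income": [20000, 40000, 60000]}

def biased_attributes(func, sensitive, neutral):
    """Return the sensitive attributes whose value alone changes func's output."""
    flagged = set()
    neutral_keys = list(neutral)
    # Enumerate every combination of the non-sensitive inputs.
    for combo in product(*(neutral[k] for k in neutral_keys)):
        base = dict(zip(neutral_keys, combo))
        # Pin all sensitive attributes to a default, then vary one at a time.
        for other, values in sensitive.items():
            base[other] = values[0]
        for attr, values in sensitive.items():
            outputs = {func({**base, attr: value}) for value in values}
            if len(outputs) > 1:
                flagged.add(attr)
    return flagged

print(biased_attributes(approve_loan, SENSITIVE, NEUTRAL))
print(biased_attributes(fair_approve_loan, SENSITIVE, NEUTRAL))
```

A quantitative score could then be derived by counting, across many generated functions, how many are flagged for each attribute. That aggregation step is also an assumption here, not a metric taken from the paper.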
Taxonomy
Topics: Artificial Intelligence in Law · Hate Speech and Cyberbullying Detection
