Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models

Yue Xu; Chengyan Fu; Li Xiong; Sibei Yang; Wenjie Wang

arXiv:2502.11559 · cs.CL · November 4, 2025

TL;DR

This paper introduces FaIRMaker, an automated, model-independent framework that uses auto-search and refinement to generate instructions called Fairwords, effectively reducing gender bias in large language models without sacrificing task performance.

Contribution

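FaIRMaker is an automated framework that adaptively generates and refines Fairwords to mitigate gender bias in LLMs. It targets two limitations of existing methods: parameter-modification approaches such as fine-tuning are resource-intensive and unsuitable for closed-source models, and instruction-based approaches often compromise task performance.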

Findings

01. Effectively reduces gender bias in LLM responses.

02. Preserves task performance while mitigating bias.

03. Compatible with both API-based and open-source models.

Abstract

Pre-training large language models (LLMs) on vast text corpora enhances natural language processing capabilities but risks encoding social biases, particularly gender bias. While parameter-modification methods like fine-tuning mitigate bias, they are resource-intensive, unsuitable for closed-source models, and lack adaptability to evolving societal norms. Instruction-based approaches offer flexibility but often compromise task performance. To address these limitations, we propose FaIRMaker, an automated and model-independent framework that employs an auto-search and refinement paradigm to adaptively generate Fairwords, which act as instructions integrated into input queries to reduce gender bias and enhance response quality. Extensive experiments demonstrate that FaIRMaker automatically searches for and dynamically refines Fairwords, effectively mitigating gender…
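The abstract describes Fairwords as instructions integrated into input queries. A minimal sketch of that integration step might look like the following. The helper name `apply_fairword`, the `position` parameter, and the example Fairword text are hypothetical and are not taken from the paper; the abstract does not say where in the query a Fairword is placed, and the search and refinement stages that produce Fairwords are not shown.

```python
def apply_fairword(query: str, fairword: str, position: str = "prefix") -> str:
    """Combine a Fairword instruction with a user query.

    Hypothetical helper: the placement options are an assumption made
    for illustration, not the paper's documented behavior.
    """
    query = query.strip()
    fairword = fairword.strip()
    if not fairword:
        # With no Fairword available, pass the query through unchanged.
        return query
    if position == "prefix":
        return f"{fairword}\n\n{query}"
    if position == "suffix":
        return f"{query}\n\n{fairword}"
    raise ValueError(f"unknown position: {position!r}")


# Example with an illustrative (invented) Fairword.
prompt = apply_fairword(
    "Describe a typical day for a nurse.",
    "Answer without relying on gender stereotypes.",
)
```

Because the combined prompt is plain text, a step like this needs no access to model parameters, which fits the abstract's claim that the approach is model-independent.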

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection