A Multi-LLM Debiasing Framework
Deonna M. Owens, Ryan A. Rossi, Sungchul Kim, Tong Yu, Franck Dernoncourt, Xiang Chen, Ruiyi Zhang, Jiuxiang Gu, Hanieh Deilamsalehy, Nedim Lipka

TL;DR
This paper introduces a multi-LLM framework for reducing bias in large language models, with centralized and decentralized variants, and reports that it outperforms baseline methods across several social groups.
Contribution
It is the first to propose and evaluate a multi-LLM debiasing framework with centralized and decentralized strategies for bias mitigation.
Findings
Significant bias reduction achieved in LLMs.
Outperforms baseline methods across social groups.
Effective in both centralized and decentralized setups.
Abstract
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet they have demonstrated biases that perpetuate societal inequalities. Despite significant advancements in bias mitigation techniques using data augmentation, zero-shot prompting, and model fine-tuning, biases persist, including subtle biases that may elude human detection. Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning and factuality in LLMs. Building on this approach, we propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs. Our work is the first to introduce and evaluate two distinct approaches within this framework for debiasing LLMs: a centralized method, where the conversation is facilitated by a single central LLM, and a decentralized method,…
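The centralized setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `centralized_debias`, the prompt wording, the number of rounds, and the stub models are all assumptions made for illustration, and a "model" here is simply any function that maps a prompt string to a reply string. The sketch covers the centralized variant only.

```python
from typing import Callable, List

# Assumption: a "model" is any callable from prompt text to reply text.
Model = Callable[[str], str]


def centralized_debias(question: str, central: Model,
                       peers: List[Model], rounds: int = 2) -> str:
    """Hypothetical centralized loop: the central model drafts an answer,
    peer models comment on it, and the central model revises."""
    answer = central(question)
    for _ in range(rounds):
        # Each peer sees only the question and the current answer.
        feedback = [
            peer(f"Question: {question}\nAnswer: {answer}\n"
                 "Point out any social bias in the answer.")
            for peer in peers
        ]
        # Only the central model sees all of the collected feedback.
        joined = "\n".join(feedback)
        answer = central(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Feedback:\n{joined}\nRevise the answer to remove bias."
        )
    return answer


# Stub models standing in for real LLM calls (illustrative only).
def stub_central(prompt: str) -> str:
    return "revised answer" if "Feedback:" in prompt else "draft answer"


def stub_peer(prompt: str) -> str:
    return "possible stereotype in the answer"


if __name__ == "__main__":
    print(centralized_debias("Who is more likely to be a nurse?",
                             stub_central, [stub_peer, stub_peer]))
```

In practice the stubs would be replaced by calls to actual models. The point of the sketch is the topology: peers never talk to each other, and every exchange passes through the central model.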
Taxonomy
Topics: Natural Language Processing Techniques · Digital Rights Management and Security
