PRIDE -- Parameter-Efficient Reduction of Identity Discrimination for Equality in LLMs

Maluna Menke; Thilo Hagendorff

arXiv:2507.13743 · cs.CL · July 21, 2025

Open Access

TL;DR

This paper evaluates two parameter-efficient fine-tuning methods, LoRA and soft-prompt tuning, for reducing gender- and sexual-identity bias in large language models. LoRA yields substantial fairness improvements with under 0.1% additional parameters.

Contribution

The paper demonstrates that LoRA fine-tuning on a curated QueerNews corpus reduces bias against queer identities in LLMs, offering a lightweight alternative to full-model fine-tuning.

Findings

01

LoRA fine-tuning reduces WinoQueer bias scores by up to 50 points (on a scale of 0 to 100, where 50 indicates neutrality).

02

With LoRA, neutrality rises from virtually 0% to as much as 36%.

03

Soft-prompt tuning shows marginal improvements.
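
The bias scores in these findings sit on a 0 to 100 scale where 50 indicates neutrality. As a rough illustration only, the sketch below assumes the score is the percentage of sentence pairs for which a model assigns higher likelihood to the stereotypical sentence than to its counterfactual. That definition, the function name `paired_bias_score`, and the log-likelihood values are assumptions made for this example, not details confirmed by this page.

```python
from typing import Sequence, Tuple


def paired_bias_score(pairs: Sequence[Tuple[float, float]]) -> float:
    """Percentage of pairs where the stereotypical sentence is more likely.

    Each pair is (logprob_stereotypical, logprob_counterfactual).
    Ties count as half a win, so a model with no preference scores 50.
    This is an assumed, simplified scoring rule for illustration.
    """
    if not pairs:
        raise ValueError("need at least one sentence pair")
    wins = 0.0
    for stereo, counter in pairs:
        if stereo > counter:
            wins += 1.0
        elif stereo == counter:
            wins += 0.5
    return 100.0 * wins / len(pairs)


# Hypothetical log-likelihoods for four sentence pairs (made-up numbers).
example_pairs = [(-12.1, -13.4), (-9.8, -9.2), (-15.0, -16.3), (-11.5, -11.9)]
score = paired_bias_score(example_pairs)
print(score)  # 75.0: the stereotypical sentence wins in 3 of 4 pairs
```

Under this assumed rule, a score near 98 would mean the model prefers the stereotypical sentence in almost every pair, and a 50-point reduction would bring it close to the neutral value of 50.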

Abstract

Large Language Models (LLMs) frequently reproduce the gender- and sexual-identity prejudices embedded in their training corpora, leading to outputs that marginalize LGBTQIA+ users. Hence, reducing such biases is of great importance. To achieve this, we evaluate two parameter-efficient fine-tuning (PEFT) techniques - Low-Rank Adaptation (LoRA) and soft-prompt tuning - as lightweight alternatives to full-model fine-tuning for mitigating such biases. Using the WinoQueer benchmark, we quantify bias in three open-source LLMs and observe baseline bias scores reaching up to 98 (out of 100) across a range of queer identities defined by gender and/or sexual orientation, where 50 would indicate neutrality. Fine-tuning with LoRA (< 0.1% additional parameters) on a curated QueerNews corpus reduces those scores by up to 50 points and raises neutrality from virtually 0% to as much as 36%. Soft-prompt…
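
To give a sense of how LoRA can stay under 0.1% additional parameters, here is a back-of-the-envelope sketch. LoRA adds two low-rank matrices of shapes (d_out × r) and (r × d_in) per adapted weight matrix, so r · (d_in + d_out) new parameters each. The model dimensions, rank, and number of adapted matrices below are hypothetical values chosen for illustration; they are not the configuration reported in the paper.

```python
def lora_added_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters added by one LoRA adapter on a (d_out x d_in) weight matrix."""
    return rank * (d_in + d_out)


# Hypothetical setup (assumed, not from the paper): a model with roughly
# 7 billion parameters, 32 layers, hidden size 4096, and rank-8 adapters
# on two square projection matrices per layer.
hidden = 4096
layers = 32
adapted_per_layer = 2
rank = 8
base_params = 7_000_000_000

added = layers * adapted_per_layer * lora_added_params(hidden, hidden, rank)
fraction = added / base_params

print(added)                      # 4194304
print(f"{100 * fraction:.3f}%")   # 0.060%
```

With these assumed numbers the adapters add about 4.2 million trainable parameters, roughly 0.06% of the base model, which is consistent in order of magnitude with the "< 0.1%" figure in the abstract.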

Taxonomy

Topics: Computational and Text Analysis Methods · Hate Speech and Cyberbullying Detection · Topic Modeling