Do the Right Thing, Just Debias! Multi-Category Bias Mitigation Using LLMs

Amartya Roy; Danush Khanna; Devanshu Mahapatra; Vasanthakumar; Avirup Das; Kripabandhu Ghosh

arXiv:2409.16371·cs.CL·September 26, 2024

TL;DR

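This paper introduces ANUBIS, a dataset of 1507 sentence pairs covering nine social bias categories, and evaluates Supervised Fine-Tuning (SFT), Reinforcement Learning (PPO, DPO), and In-Context Learning (ICL) for multi-category bias mitigation, with attention to cross-dataset generalizability and environmental impact.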

Contribution

The paper presents ANUBIS, a curated dataset of sentence pairs spanning nine social bias categories, and compares SFT, PPO, DPO, and ICL as methods for multi-category bias mitigation with models such as T5.

Findings

01

ANUBIS contains 1507 curated sentence pairs across nine bias categories.

02

State-of-the-art models like T5 show varying effectiveness in bias mitigation.

03

The analysis covers cross-dataset generalizability and the environmental impact of the trained models.

Abstract

This paper tackles the challenge of building robust and generalizable bias mitigation models for language. Recognizing the limitations of existing datasets, we introduce ANUBIS, a novel dataset with 1507 carefully curated sentence pairs encompassing nine social bias categories. We evaluate state-of-the-art models like T5, utilizing Supervised Fine-Tuning (SFT), Reinforcement Learning (PPO, DPO), and In-Context Learning (ICL) for effective bias mitigation. Our analysis focuses on multi-class social bias reduction, cross-dataset generalizability, and environmental impact of the trained models. ANUBIS and our findings offer valuable resources for building more equitable AI systems and contribute to the development of responsible and unbiased technologies with broad societal impact.
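Of the approaches the abstract lists, In-Context Learning is the simplest to illustrate without any training. The following is a minimal sketch, not the authors' implementation: it assumes each dataset item is a (biased, debiased) sentence pair and formats a few such pairs as a few-shot prompt ending in an unanswered query. The names `SentencePair` and `build_icl_prompt`, the instruction wording, and the demonstration sentences are all hypothetical and are not taken from ANUBIS.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SentencePair:
    """One (biased, debiased) pair; a hypothetical container, not the ANUBIS schema."""
    biased: str
    debiased: str


def build_icl_prompt(demos: Sequence[SentencePair], query: str) -> str:
    """Format demonstration pairs plus one unanswered query as a few-shot prompt."""
    # Instruction text is an assumption chosen for illustration.
    header = "Rewrite each sentence to remove social bias while keeping its meaning."
    blocks = [f"Biased: {d.biased}\nDebiased: {d.debiased}" for d in demos]
    # The final block leaves the rewrite empty for the model to complete.
    blocks.append(f"Biased: {query}\nDebiased:")
    return header + "\n\n" + "\n\n".join(blocks)


# Placeholder demonstrations invented for this sketch; not drawn from ANUBIS.
demos = [
    SentencePair(
        "Old people cannot learn new software.",
        "Some people find new software hard to learn.",
    ),
    SentencePair(
        "Nurses are always women.",
        "Nurses can be of any gender.",
    ),
]

prompt = build_icl_prompt(demos, "Engineers from that country are lazy.")
print(prompt)
```

The resulting string could be passed to any text-generation model; the number and choice of demonstrations are the main knobs in this kind of setup.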

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Attention Is All You Need · Byte Pair Encoding · Linear Layer · Adafactor · Gated Linear Unit · SentencePiece · Softmax · Layer Normalization · Inverse Square Root Schedule