ROBBIE: Robust Bias Evaluation of Large Generative Language Models

David Esiobu; Xiaoqing Tan; Saghar Hosseini; Megan Ung; Yuchen Zhang; Jude Fernandes; Jane Dwivedi-Yu; Eleonora Presani; Adina Williams; Eric Michael Smith

arXiv:2311.18140 · cs.CL · December 1, 2023 · 2 cites

Open Access

TL;DR

ROBBIE provides a comprehensive framework for evaluating and mitigating social biases in large language models using diverse datasets and metrics, offering insights into bias sources and mitigation effectiveness.

Contribution

This work introduces new bias and toxicity benchmarks, compares multiple models across various demographic axes, and evaluates mitigation techniques, advancing bias measurement and reduction in LLMs.

Findings

01. AdvPromptSet and HolisticBiasR are effective new bias benchmarks.

02. Bias varies significantly across models and demographic axes.

03. Mitigation techniques show mixed success depending on the metric and bias type.

Abstract

As generative large language models (LLMs) grow more performant and prevalent, we must develop sufficiently comprehensive tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes, meaning that testing LLMs on more datasets can potentially help us characterize their biases more fully, and better ensure equal and equitable treatment of marginalized demographic groups. In this work, our focus is two-fold: (1) Benchmarking: a comparison of 6 different prompt-based bias and toxicity metrics across 12 demographic axes and 5 families of generative LLMs. Out of those 6 metrics, AdvPromptSet and HolisticBiasR are novel datasets proposed in the paper. The comparison of those benchmarks gives us insights about the bias and toxicity of the compared models. Therefore, we explore the frequency of…

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods

MethodsFocus