ROBBIE: Robust Bias Evaluation of Large Generative Language Models
David Esiobu, Xiaoqing Tan, Saghar Hosseini, Megan Ung, Yuchen Zhang, Jude Fernandes, Jane Dwivedi-Yu, Eleonora Presani, Adina Williams, Eric Michael Smith

TL;DR
ROBBIE provides a comprehensive framework for evaluating and mitigating social biases in large language models using diverse datasets and metrics, offering insights into bias sources and mitigation effectiveness.
Contribution
This work introduces new bias and toxicity benchmarks, compares multiple models across various demographic axes, and evaluates mitigation techniques, advancing bias measurement and reduction in LLMs.
Findings
AdvPromptSet and HolisticBiasR are effective new bias benchmarks.
Bias varies significantly across models and demographic axes.
Mitigation techniques show mixed success depending on the metric and bias type.
Abstract
As generative large language models (LLMs) grow more performant and prevalent, we must develop comprehensive enough tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes, meaning that testing LLMs on more datasets can potentially help us characterize their biases more fully, and better ensure equal and equitable treatment of marginalized demographic groups. In this work, our focus is two-fold: (1) Benchmarking: a comparison of 6 different prompt-based bias and toxicity metrics across 12 demographic axes and 5 families of generative LLMs. Out of those 6 metrics, AdvPromptSet and HolisticBiasR are novel datasets proposed in the paper. The comparison of those benchmarks gives us insights about the bias and toxicity of the compared models. Therefore, we explore the frequency of…
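To make the benchmarking idea in the abstract concrete, here is a minimal sketch of how a prompt-based toxicity comparison across demographic groups could be tallied. It is not the paper's metric or code: the group names, the toy records, and the helpers `toxicity_rate_by_group` and `max_gap` are all hypothetical, and the toxicity labels are assumed to come from some external classifier applied to model generations.

```python
from collections import defaultdict


def toxicity_rate_by_group(records):
    """Return {group: fraction of generations labeled toxic}.

    `records` is an iterable of (group, is_toxic) pairs. Both the grouping
    and the labels are assumed inputs, e.g. from a toxicity classifier.
    """
    totals = defaultdict(int)
    toxic = defaultdict(int)
    for group, is_toxic in records:
        totals[group] += 1
        toxic[group] += int(is_toxic)
    return {group: toxic[group] / totals[group] for group in totals}


def max_gap(rates):
    """Largest difference in toxicity rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy, made-up data: one record per model generation for a prompt
# that mentions the given (hypothetical) demographic group.
toy_records = [
    ("group_a", True), ("group_a", False), ("group_a", False), ("group_a", False),
    ("group_b", True), ("group_b", True), ("group_b", False), ("group_b", False),
]

rates = toxicity_rate_by_group(toy_records)
gap = max_gap(rates)
print(rates, gap)
```

A per-group rate plus a single gap statistic is only one simple way to summarize disparities; a real evaluation would also need many more samples per group and some estimate of uncertainty.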
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods
