Why Don't Prompt-Based Fairness Metrics Correlate?

Abdelrahman Zayed; Goncalo Mordido; Ioana Baldini; Sarath Chandar

arXiv:2406.05918 · cs.CL · June 11, 2024

TL;DR

This paper shows that prompt-based fairness metrics for large language models agree poorly with one another, outlines six reasons for the low correlation, and proposes CAIRO, a method that substantially improves agreement between metrics.

Contribution

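It introduces CAIRO, a method that improves the correlation between fairness metrics by augmenting each metric's original prompts with several pre-trained language models and selecting the combination of augmented prompts that yields the highest correlation, addressing reliability concerns in prompt-based bias evaluation.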

Findings

01

CAIRO raises the Pearson correlation across fairness metrics from 0.3 and 0.18 to 0.90 and 0.98.

02

Demonstrates low initial agreement among existing fairness metrics.

03

Outlines six reasons for the poor correlation across existing fairness metrics.

Abstract

The widespread use of large language models has brought up essential questions about the potential biases these models might learn. This led to the development of several metrics aimed at evaluating and mitigating these biases. In this paper, we first demonstrate that prompt-based fairness metrics exhibit poor agreement, as measured by correlation, raising important questions about the reliability of fairness assessment using prompts. Then, we outline six relevant reasons why such a low correlation is observed across existing metrics. Based on these insights, we propose a method called Correlated Fairness Output (CAIRO) to enhance the correlation between fairness metrics. CAIRO augments the original prompts of a given fairness metric by using several pre-trained language models and then selects the combination of the augmented prompts that achieves the highest correlation across…
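The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes that agreement is the Pearson correlation between two metrics' bias scores over the same set of evaluated models, that each metric has one score list per prompt source (the original prompts plus prompts augmented by hypothetical models `paraphraser_1` and `paraphraser_2`), that a combination of sources is scored by averaging, and that the search is exhaustive. All names and numbers below are toy values invented for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat constant scores as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def combined_scores(scores_by_source, sources):
    """Average the per-model bias scores over the chosen prompt sources."""
    columns = zip(*(scores_by_source[s] for s in sources))
    return [mean(col) for col in columns]


def nonempty_subsets(items):
    """Yield every non-empty combination of prompt sources."""
    items = sorted(items)
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


def select_best_combination(metric_a, metric_b):
    """Exhaustively pick the source subsets that maximize the correlation."""
    best = (float("-inf"), None, None)
    for subset_a in nonempty_subsets(metric_a):
        scores_a = combined_scores(metric_a, subset_a)
        for subset_b in nonempty_subsets(metric_b):
            scores_b = combined_scores(metric_b, subset_b)
            r = pearson(scores_a, scores_b)
            if r > best[0]:
                best = (r, subset_a, subset_b)
    return best


# Toy bias scores for four evaluated models (invented numbers).
metric_a = {
    "original": [0.10, 0.40, 0.20, 0.30],
    "paraphraser_1": [0.20, 0.30, 0.25, 0.35],
    "paraphraser_2": [0.15, 0.35, 0.30, 0.40],
}
metric_b = {
    "original": [0.30, 0.20, 0.25, 0.10],
    "paraphraser_1": [0.12, 0.33, 0.21, 0.36],
    "paraphraser_2": [0.18, 0.28, 0.35, 0.31],
}

baseline_r = pearson(metric_a["original"], metric_b["original"])
best_r, best_a, best_b = select_best_combination(metric_a, metric_b)
print(f"original prompts only: r = {baseline_r:.2f}")
print(f"best combination: r = {best_r:.2f} using {best_a} and {best_b}")
```

Because the original prompts alone are one of the candidate combinations, the selected correlation can never fall below the baseline in this sketch. The exhaustive search grows exponentially with the number of prompt sources, so it only suits a handful of augmentation models.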

Code & Models

Repositories

chandar-lab/cairo (PyTorch, official)

Videos

Why Don’t Prompt-Based Fairness Metrics Correlate? (Underline)

Taxonomy

Topics: Ethics and Social Impacts of AI