Bring Your Own Prompts: Use-Case-Specific Bias and Fairness Evaluation for LLMs

Dylan Bouchard

arXiv:2407.10853 · cs.CL · May 12, 2026 · 2 cites


TL;DR

This paper introduces a decision framework for selecting bias and fairness metrics tailored to specific LLM deployment contexts, emphasizing the importance of context-aware evaluation over generic benchmarks.

Contribution

The paper proposes a systematic approach for matching LLM use cases to relevant bias and fairness metrics, and releases an open-source Python library, langfair, for practical implementation.

Findings

01

Fairness risks vary significantly across different prompt populations.

02

Benchmark performance alone is insufficient for reliable fairness assessment.

03

The framework effectively guides context-specific bias and fairness evaluation.

Abstract

Bias and fairness risks in Large Language Models (LLMs) vary substantially across deployment contexts, yet existing approaches lack systematic guidance for selecting appropriate evaluation metrics. We present a decision framework that maps LLM use cases, characterized by a model and population of prompts, to relevant bias and fairness metrics based on task type, whether prompts contain protected attribute mentions, and stakeholder priorities. Our framework addresses toxicity, stereotyping, counterfactual unfairness, and allocational harms, and introduces novel metrics based on stereotype classifiers and counterfactual adaptations of text similarity measures. We release an open-source Python library, langfair, for practical adoption. Extensive experiments on use cases across five LLMs and five prompt populations demonstrate that fairness risks cannot be reliably assessed from…
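
To make the idea of mapping a use case to metric families concrete, here is a minimal Python sketch. It is not the paper's decision framework and does not use the langfair API: the `UseCase` class, the `select_metric_families` function, the task-type labels, and the specific branching rules are all hypothetical, chosen only to illustrate how the three inputs named in the abstract (task type, protected attribute mentions in prompts, stakeholder priorities) could drive a selection among the four risk categories it lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """Hypothetical description of an LLM use case (illustrative only)."""
    task_type: str                              # assumed labels: "generation" or "classification"
    prompts_mention_protected_attributes: bool  # do prompts mention protected attributes?
    priorities: frozenset = frozenset()         # metric families stakeholders care about; empty = all


def select_metric_families(use_case: UseCase) -> list:
    """Return candidate metric families for a use case.

    The rules below are assumptions made for this sketch, not the
    paper's actual decision logic.
    """
    families = []
    if use_case.task_type == "classification":
        # Assumed: classification outputs drive decisions, so allocational harms apply.
        families.append("allocational harms")
    else:
        # Assumed: free-text outputs are checked for toxicity and stereotyping.
        families.extend(["toxicity", "stereotyping"])
        if use_case.prompts_mention_protected_attributes:
            # Assumed: counterfactual checks only matter when prompts mention protected attributes.
            families.append("counterfactual unfairness")
    if use_case.priorities:
        # Keep only the families that stakeholders have prioritized.
        families = [f for f in families if f in use_case.priorities]
    return families


if __name__ == "__main__":
    chatbot = UseCase(task_type="generation", prompts_mention_protected_attributes=True)
    print(select_metric_families(chatbot))
```

Under these assumed rules, the same model paired with a different prompt population (for example, one with no protected attribute mentions) yields a different set of metric families, which is the point of evaluating a use case rather than a model in isolation.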

