Using Natural Sentences for Understanding Biases in Language Models
Sarah Alnegheimish, Alicia Guo, Yi Sun

TL;DR
This paper investigates gender-occupation biases in language models by comparing template-based prompts with natural sentences from Wikipedia, highlighting the importance of prompt design in bias evaluation.
Contribution
It introduces a natural sentence prompt dataset for bias analysis and demonstrates the impact of prompt design on bias evaluation outcomes.
Findings
Bias evaluations are very sensitive to the design choices of template prompts
Natural sentence prompts are proposed as a basis for more systematic bias evaluation
Design choices in template prompts can introduce bias into the observed results
Abstract
Evaluation of biases in language models is often limited to synthetically generated datasets. This dependence traces back to the need for a prompt-style dataset to trigger specific behaviors of language models. In this paper, we address this gap by creating a prompt dataset with respect to occupations collected from real-world natural sentences present in Wikipedia. We aim to understand the differences between using template-based prompts and natural sentence prompts when studying gender-occupation biases in language models. We find bias evaluations are very sensitive to the design choices of template prompts, and we propose using natural sentence prompts for systematic evaluations to step away from design choices that could introduce bias in the observations.
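To make the contrast in the abstract concrete, here is a minimal sketch of the two prompt styles. It is not the authors' pipeline. The occupation list, the templates, the truncation rule (cut a sentence right after its first occupation mention), and the pronoun-counting helper are all assumptions chosen for illustration, and none of the strings come from the paper's dataset.

```python
import re

# Hypothetical occupations and templates, for illustration only.
OCCUPATIONS = ["nurse", "engineer"]
TEMPLATES = ["The {occupation} said that", "The {occupation} was known because"]


def template_prompts(occupations, templates):
    """Fill every template with every occupation (synthetic prompts)."""
    return [t.format(occupation=o) for o in occupations for t in templates]


def natural_prompts(sentences, occupations):
    """Turn real sentences into prompts.

    Assumption: a prompt is the sentence prefix that ends at the first
    occupation mention. Sentences with no occupation are skipped.
    """
    prompts = []
    for sentence in sentences:
        for occ in occupations:
            match = re.search(rf"\b{re.escape(occ)}\b", sentence, flags=re.IGNORECASE)
            if match:
                prompts.append(sentence[: match.end()])
                break
    return prompts


MALE = {"he", "him", "his"}
FEMALE = {"she", "her", "hers"}


def gender_counts(continuation):
    """Return (male, female) pronoun counts in a model continuation."""
    tokens = re.findall(r"[a-z']+", continuation.lower())
    return sum(t in MALE for t in tokens), sum(t in FEMALE for t in tokens)
```

In this sketch, each prompt would be passed to a language model and `gender_counts` applied to the generated continuation. Comparing the counts across the two prompt sets shows how much the template wording alone shifts the measured bias.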
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification
