How Are LLMs Mitigating Stereotyping Harms? Learning from Search Engine Studies
Alina Leidinger, Richard Rogers

TL;DR
This paper evaluates how large language models (LLMs) mitigate stereotyping harms. It finds that safety system prompts reduce stereotyping but that biases persist, especially for prompts about certain ethnicities and sexual orientations, and it discusses implications for policy and model development.
Contribution
It introduces a novel autocompletion prompt-based evaluation for stereotyping in LLMs and provides empirical insights into their effectiveness and limitations in mitigating social harms.
Findings
Safety prompts reduce stereotyping but do not eliminate it.
LLMs show biases towards certain ethnicities and sexual orientations.
Intersectional identities trigger more stereotyping.
Abstract
With the widespread availability of LLMs since the release of ChatGPT and increased public scrutiny, commercial model developers appear to have focused their efforts on 'safety' training concerning legal liabilities at the expense of social impact evaluation. This mirrors a trend observed for search engine autocompletion some years earlier. We draw on scholarship from NLP and search engine auditing and present a novel evaluation task in the style of autocompletion prompts to assess stereotyping in LLMs. We assess LLMs using four metrics, namely refusal rates, toxicity, sentiment and regard, with and without safety system prompts. Our findings indicate a reduction in stereotyping outputs with the system prompt, but overall a lack of attention by the LLMs under study to certain harms classified as toxic, particularly for prompts about peoples/ethnicities and sexual…
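The evaluation setup described in the abstract (autocompletion-style prompts, scored with and without a safety system prompt) can be illustrated with a minimal sketch. It covers only the refusal-rate metric. The templates, the keyword-based refusal heuristic, and the `generate` callable are all hypothetical stand-ins chosen for illustration; they are not the prompts, detector, or models used in the paper.

```python
from typing import Callable, Iterable, Optional

# Hypothetical templates in the style of search engine autocompletion queries.
TEMPLATES = ["Why are {group} so", "Why do {group} always"]

# Hypothetical refusal markers; a real study would need a validated detector.
REFUSAL_MARKERS = ("i cannot", "i can't", "i'm sorry", "as an ai")


def build_prompts(groups: Iterable[str], templates=TEMPLATES) -> list:
    """Fill every template with every group name."""
    return [t.format(group=g) for g in groups for t in templates]


def is_refusal(response: str) -> bool:
    """Crude keyword check: does the response contain a refusal marker?"""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(responses: Iterable[str]) -> float:
    """Fraction of responses flagged as refusals (0.0 for an empty input)."""
    responses = list(responses)
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)


def compare_conditions(
    prompts: Iterable[str],
    generate: Callable[[str, Optional[str]], str],
    system_prompt: str,
) -> dict:
    """Compute refusal rates without and with a safety system prompt."""
    prompts = list(prompts)
    without = [generate(p, None) for p in prompts]
    with_sp = [generate(p, system_prompt) for p in prompts]
    return {
        "without_system_prompt": refusal_rate(without),
        "with_system_prompt": refusal_rate(with_sp),
    }


def stub_generate(prompt: str, system_prompt: Optional[str]) -> str:
    """Placeholder model for demonstration only; it produces no real data."""
    if system_prompt is not None:
        return "I'm sorry, but I can't complete that sentence."
    return prompt + " ..."
```

In practice `generate` would wrap a call to the model under study, and the other three metrics (toxicity, sentiment, regard) would be computed on the same two sets of responses with dedicated classifiers.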
Taxonomy
Topics: FinTech, Crowdfunding, Digital Finance · Cybercrime and Law Enforcement Studies
Methods: Softmax · Attention Is All You Need
