Towards WinoQueer: Developing a Benchmark for Anti-Queer Bias in Large Language Models

Virginia K. Felkner; Ho-Chun Herbert Chang; Eugene Jang; Jonathan May

arXiv:2206.11484·cs.CL·July 11, 2022·5 cites

Open Access

TL;DR

This paper introduces WinoQueer, a benchmark dataset to measure anti-queer bias in large language models like BERT, and demonstrates bias mitigation through targeted fine-tuning on LGBTQ+ authored data.

Contribution

The paper develops WinoQueer, a novel benchmark for detecting anti-queer bias, and proposes a fine-tuning method to reduce such biases in LLMs.

Findings

01. BERT exhibits significant homophobic bias, as measured on WinoQueer.

02. Fine-tuning BERT on a natural language corpus written by members of the LGBTQ+ community mostly mitigates this bias.

03. Bias mitigation is achievable with targeted fine-tuning.

Abstract

This paper presents exploratory work on whether and to what extent biases against queer and trans people are encoded in large language models (LLMs) such as BERT. We also propose a method for reducing these biases in downstream tasks: finetuning the models on data written by and/or about queer people. To measure anti-queer bias, we introduce a new benchmark dataset, WinoQueer, modeled after other bias-detection benchmarks but addressing homophobic and transphobic biases. We found that BERT shows significant homophobic bias, but this bias can be mostly mitigated by finetuning BERT on a natural language corpus written by members of the LGBTQ+ community.
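The abstract says only that WinoQueer is modeled after other bias-detection benchmarks; it does not spell out the scoring procedure. As a rough illustration of how paired-sentence benchmarks of this kind are often scored, the sketch below assumes each item is a pair of sentences that differ only in the identity term, and reports the percentage of pairs in which a sentence scorer rates the straight-subject sentence higher. The pair format, the function names (preference_rate, toy_score), and the metric itself are assumptions for illustration, not the authors' method.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical pair format (an assumption, not taken from the paper):
# two sentences identical except for the identity term.
Pair = Tuple[str, str]  # (queer-subject sentence, straight-subject sentence)


def preference_rate(pairs: Sequence[Pair], score: Callable[[str], float]) -> float:
    """Percentage of pairs in which `score` rates the second sentence higher.

    50.0 means no systematic preference; ties count as half a win.
    """
    if not pairs:
        raise ValueError("need at least one sentence pair")
    wins = 0.0
    for queer_sent, straight_sent in pairs:
        s_q, s_s = score(queer_sent), score(straight_sent)
        if s_s > s_q:
            wins += 1.0
        elif s_s == s_q:
            wins += 0.5
    return 100.0 * wins / len(pairs)


def toy_score(sentence: str) -> float:
    """Stand-in scorer, NOT a language model.

    It penalizes one identity term so the example is deterministic and
    mimics a biased model. A real run would plug in a model-based sentence
    score, for example a masked-LM pseudo-log-likelihood.
    """
    return -1.0 if "gay" in sentence.lower() else 0.0


pairs = [
    ("The gay couple bought a house.", "The straight couple bought a house."),
    ("My neighbor is gay.", "My neighbor is straight."),
]

biased_rate = preference_rate(pairs, toy_score)          # scorer that penalizes "gay"
neutral_rate = preference_rate(pairs, lambda s: 0.0)     # scorer with no preference
print(biased_rate, neutral_rate)
```

Under this assumed metric, comparing the rate for an off-the-shelf model against the rate for the same model after finetuning would show whether the preference moves toward 50.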

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Softmax · Weight Decay · Linear Warmup With Linear Decay · Residual Connection · Adam · Layer Normalization · Attention Dropout · Dropout