A Comprehensive Study of Implicit and Explicit Biases in Large Language Models
Fatima Kazi, Alex Young, Yash Inani, Setareh Rafatirad

TL;DR
This study evaluates biases in large language models using benchmarks and proposes an automated framework for bias detection, revealing strengths and weaknesses in identifying explicit and implicit biases, and demonstrating improvements through fine-tuning.
Contribution
It introduces an automated Bias-Identification Framework and demonstrates how fine-tuning enhances bias detection in LLMs across multiple social categories.
Findings
Fine-tuned models better detect racial biases.
Models struggle with gender biases despite fine-tuning.
Fine-tuning improves bias detection performance by up to 20%.
Abstract
Large Language Models (LLMs) inherit explicit and implicit biases from their training datasets. Identifying and mitigating these biases is crucial to ensuring fair outputs, because biased models can perpetuate harmful stereotypes and misinformation. This study highlights the need to address biases in LLMs amid the growth of generative AI. We studied bias-specific benchmarks such as StereoSet and CrowS-Pairs to evaluate the existence of various biases in multiple generative models such as BERT and GPT-3.5. We proposed an automated Bias-Identification Framework to recognize various social biases in LLMs, such as gender, race, profession, and religion. We adopted a two-pronged approach to detect explicit and implicit biases in text data. Results indicated that fine-tuned models struggled with gender biases but excelled at identifying and avoiding racial biases. Our findings illustrated that despite having some…
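For readers unfamiliar with pair-based bias benchmarks, the sketch below illustrates the general idea behind CrowS-Pairs-style scoring: count how often a model scores the more stereotypical sentence of a pair above its less stereotypical counterpart. This is not the authors' Bias-Identification Framework. The names stereotype_preference_rate and toy_score are hypothetical, the placeholder sentences are not benchmark data, and the toy scorer merely stands in for a real model's sentence score (for example, a log-likelihood).

```python
from typing import Callable, Iterable, Tuple


def stereotype_preference_rate(
    pairs: Iterable[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Return the fraction of (stereotypical, anti-stereotypical) pairs
    in which the stereotypical sentence gets the strictly higher score."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one sentence pair")
    preferred = sum(1 for stereo, anti in pairs if score(stereo) > score(anti))
    return preferred / len(pairs)


# Hypothetical stand-in scorer (an assumption, NOT a language model):
# shorter sentences receive higher scores.
def toy_score(sentence: str) -> float:
    return -float(len(sentence.split()))


# Placeholder pairs: (stereotypical, anti-stereotypical).
toy_pairs = [
    ("a b c", "a b c d"),   # first sentence scores higher, so it is preferred
    ("a b c d e", "a b"),   # second sentence scores higher, so it is not
]

rate = stereotype_preference_rate(toy_pairs, toy_score)
print(f"stereotype preference rate: {rate:.2f}")
```

Under this setup, a rate near 0.5 would suggest no systematic preference for either sentence of a pair, while a rate well above 0.5 would suggest the scorer favors the stereotypical sentences.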
Taxonomy
Topics: Computational and Text Analysis Methods · Artificial Intelligence in Healthcare and Education · Ethics and Social Impacts of AI
