Evaluating Implicit Biases in LLM Reasoning through Logic Grid Puzzles

Fatima Jahara; Mark Dredze; Sharon Levy

arXiv:2511.06160 · cs.AI · November 11, 2025

Open Access

TL;DR

This paper introduces PRIME, a new framework using logic grid puzzles to systematically evaluate and quantify implicit social biases, especially gender stereotypes, in large language models' reasoning processes.

Contribution

We propose PRIME, an evaluation method that uses logic grid puzzles to detect subtle social biases in LLM reasoning. Because the puzzles can be generated and verified automatically, the method supports controlled, fine-grained bias analysis.

Findings

01. Models reason more accurately when a puzzle's solution is stereotypical.

02. PRIME reveals gender biases in LLM reasoning.

03. The framework supports controlled bias comparisons across stereotypical, anti-stereotypical, and neutral variants of the same puzzle.

Abstract

While recent safety guardrails effectively suppress overtly biased outputs, subtler forms of social bias emerge during complex logical reasoning tasks that evade current evaluation benchmarks. To fill this gap, we introduce a new evaluation framework, PRIME (Puzzle Reasoning for Implicit Biases in Model Evaluation), that uses logic grid puzzles to systematically probe the influence of social stereotypes on logical reasoning and decision making in LLMs. Our use of logic puzzles enables automatic generation and verification, as well as variability in complexity and biased settings. PRIME includes stereotypical, anti-stereotypical, and neutral puzzle variants generated from a shared puzzle structure, allowing for controlled and fine-grained comparisons. We evaluate multiple model families across puzzle sizes and test the effectiveness of prompt-based mitigation strategies. Focusing our…
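
To make the idea of variants "generated from a shared puzzle structure" concrete, here is a minimal sketch in Python. It is not the authors' implementation: the clue format, the names `solve`, `is_well_posed`, and `instantiate`, and the example names and occupations are all hypothetical, chosen only for illustration. It shows one abstract puzzle whose clues are checked for a unique solution by brute force, then filled with three different name lists so that only the social framing changes.

```python
from itertools import permutations

# Hypothetical shared structure: three slots (P0..P2) and three occupations.
OCCUPATIONS = ["nurse", "engineer", "teacher"]

# Each clue is (slot, occupation, must_match). Illustrative format, not PRIME's.
CLUES = [
    (0, "nurse", True),     # P0 is the nurse
    (1, "teacher", False),  # P1 is not the teacher
]

# Illustrative name lists; which pairing counts as stereotypical is an
# assumption made for this sketch.
VARIANTS = {
    "stereotypical": ["Mary", "John", "Alex"],
    "anti_stereotypical": ["John", "Mary", "Alex"],
    "neutral": ["Person A", "Person B", "Person C"],
}


def solve(clues, occupations):
    """Return every slot-to-occupation assignment that satisfies all clues."""
    return [
        perm
        for perm in permutations(occupations)
        if all((perm[slot] == occ) == must for slot, occ, must in clues)
    ]


def is_well_posed(clues, occupations):
    """A puzzle is usable only if exactly one assignment satisfies the clues."""
    return len(solve(clues, occupations)) == 1


def instantiate(names, clues, occupations):
    """Fill the shared structure with concrete names; return clue text and answer."""
    (solution,) = solve(clues, occupations)  # raises if the puzzle is not unique
    text = [
        f"{names[slot]} {'is' if must else 'is not'} the {occ}."
        for slot, occ, must in clues
    ]
    answer = {names[i]: occ for i, occ in enumerate(solution)}
    return text, answer


if __name__ == "__main__":
    assert is_well_posed(CLUES, OCCUPATIONS)
    for label, names in VARIANTS.items():
        text, answer = instantiate(names, CLUES, OCCUPATIONS)
        print(label, text, answer)
```

Because every variant shares the same clues and the same logical solution, any difference in a model's accuracy between variants can be attributed to the names rather than to puzzle difficulty, which is the kind of controlled comparison the abstract describes.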

Taxonomy

Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning