Is Reasoning All You Need? Probing Bias in the Age of Reasoning Language Models

Riccardo Cantini; Nicola Gabriele; Alessio Orsino; Domenico Talia

arXiv:2507.02799·cs.CL·July 4, 2025

TL;DR

This paper investigates how reasoning capabilities in language models influence their robustness to social biases, revealing that reasoning may unintentionally increase vulnerability to bias elicitation and highlighting the need for bias-aware reasoning methods.

Contribution

The paper systematically evaluates bias safety in reasoning language models, contrasts reasoning acquired through fine-tuning with reasoning elicited by prompting, and characterizes how vulnerable these models are to bias elicitation and jailbreak attacks.

Findings

01

Reasoning models are more vulnerable to bias elicitation than base models without reasoning mechanisms.

02

Models fine-tuned for reasoning are somewhat safer than models that rely on CoT prompting.

03

Reasoning mechanisms can open pathways for stereotype reinforcement.

Abstract

Reasoning Language Models (RLMs) have gained traction for their ability to perform complex, multi-step reasoning tasks through mechanisms such as Chain-of-Thought (CoT) prompting or fine-tuned reasoning traces. While these capabilities promise improved reliability, their impact on robustness to social biases remains unclear. In this work, we leverage the CLEAR-Bias benchmark, originally designed for Large Language Models (LLMs), to investigate the adversarial robustness of RLMs to bias elicitation. We systematically evaluate state-of-the-art RLMs across diverse sociocultural dimensions, using an LLM-as-a-judge approach for automated safety scoring and leveraging jailbreak techniques to assess the strength of built-in safety mechanisms. Our evaluation addresses three key questions: (i) how the introduction of reasoning capabilities affects model fairness and robustness; (ii) whether…
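
As a rough illustration of the automated safety scoring mentioned in the abstract, the sketch below aggregates judge verdicts into a per-category safety rate. It is only a minimal sketch under stated assumptions: the label names, the `SAFE_LABELS` set, and the `safety_by_category` helper are hypothetical and are not taken from the paper or from CLEAR-Bias, whose actual scoring may differ.

```python
from collections import defaultdict

# Hypothetical verdicts a judge model might assign; the benchmark's
# real label set and scoring formula may differ.
SAFE_LABELS = {"refusal", "debiased"}


def safety_by_category(judgements):
    """Return the fraction of responses judged safe for each bias category.

    `judgements` is an iterable of (category, label) pairs, where `label`
    is the verdict assigned by the judge to one model response.
    """
    totals = defaultdict(int)
    safe = defaultdict(int)
    for category, label in judgements:
        totals[category] += 1
        if label in SAFE_LABELS:
            safe[category] += 1
    return {category: safe[category] / totals[category] for category in totals}


# Toy input: two judged responses per category.
judgements = [
    ("age", "refusal"),
    ("age", "stereotyped"),
    ("gender", "debiased"),
    ("gender", "refusal"),
]
scores = safety_by_category(judgements)
```

Under this toy scheme, running the same aggregation on responses to plain prompts and to jailbreak-wrapped prompts would give one simple way to compare how much a model's safety rate drops under attack.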

Taxonomy

Methods: Chain-of-thought prompting · Balanced Selection