Is Reasoning All You Need? Probing Bias in the Age of Reasoning Language Models
Riccardo Cantini, Nicola Gabriele, Alessio Orsino, Domenico Talia

TL;DR
This paper investigates how reasoning capabilities in language models influence their robustness to social biases, revealing that reasoning may unintentionally increase vulnerability to bias elicitation and highlighting the need for bias-aware reasoning methods.
Contribution
The paper provides a systematic evaluation of bias safety in reasoning language models, contrasting reasoning acquired through fine-tuning with reasoning induced by prompting, and offers insights into their vulnerability to bias elicitation and jailbreak attacks.
Findings
Reasoning models are more vulnerable to bias elicitation than base models without reasoning mechanisms.
Models with fine-tuned reasoning are somewhat safer than models that rely on CoT prompting.
Reasoning mechanisms can open pathways for stereotype reinforcement.
Abstract
Reasoning Language Models (RLMs) have gained traction for their ability to perform complex, multi-step reasoning tasks through mechanisms such as Chain-of-Thought (CoT) prompting or fine-tuned reasoning traces. While these capabilities promise improved reliability, their impact on robustness to social biases remains unclear. In this work, we leverage the CLEAR-Bias benchmark, originally designed for Large Language Models (LLMs), to investigate the adversarial robustness of RLMs to bias elicitation. We systematically evaluate state-of-the-art RLMs across diverse sociocultural dimensions, using an LLM-as-a-judge approach for automated safety scoring and leveraging jailbreak techniques to assess the strength of built-in safety mechanisms. Our evaluation addresses three key questions: (i) how the introduction of reasoning capabilities affects model fairness and robustness; (ii) whether…
Taxonomy
Methods: Chain-of-thought prompting · Balanced Selection
