Critical or Compliant? The Double-Edged Sword of Reasoning in Chain-of-Thought Explanations

Eunkyu Park; Wesley Hanwen Deng; Vasudha Varadarajan; Mingxi Yan; Gunhee Kim; Maarten Sap; Motahhare Eslami

arXiv:2511.12001 · cs.CL · November 20, 2025

Open Access

TL;DR

This paper investigates how Chain-of-Thought explanations in multimodal moral reasoning can both aid understanding and foster overconfidence, revealing that tone and presentation influence user trust and error detection.

Contribution

The paper systematically examines the double-edged role of CoT explanations in multimodal moral reasoning, highlighting how delivery style affects user trust and error recognition for vision language model outputs.

Findings

01 Users often equate trust with outcome agreement, continuing to rely on the model even when its reasoning is flawed.

02 A confident tone suppresses error detection while reliance stays the same.

03 Delivery style can override the actual correctness of the reasoning.

Abstract

Explanations are often promoted as tools for transparency, but they can also foster confirmation bias; users may assume reasoning is correct whenever outputs appear acceptable. We study this double-edged role of Chain-of-Thought (CoT) explanations in multimodal moral scenarios by systematically perturbing reasoning chains and manipulating delivery tones. Specifically, we analyze reasoning errors in vision language models (VLMs) and how they impact user trust and the ability to detect errors. Our findings reveal two key effects: (1) users often equate trust with outcome agreement, sustaining reliance even when reasoning is flawed, and (2) the confident tone suppresses error detection while maintaining reliance, showing that delivery styles can override correctness. These results highlight how CoT explanations can simultaneously clarify and mislead, underscoring the need for NLP systems…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Ethics and Social Impacts of AI