The Persuasion Paradox: When LLM Explanations Fail to Improve Human-AI Team Performance
Ruth Cohen, Lu Feng, Ayala Bloch, Sarit Kraus

TL;DR
This study shows that LLM explanations can raise user confidence without improving task accuracy, and that the effect depends on interface design and task type.
Contribution
The paper identifies a Persuasion Paradox: explanations boost user confidence and reliance on AI without necessarily improving team performance. It also shows that the effect depends on the task and on how the interface presents AI support.
Findings
Explanations increase confidence but do not improve accuracy in visual reasoning tasks.
Probability-based interfaces improve accuracy and error recovery over explanation-based ones.
In language reasoning tasks, explanations do help, outperforming other support methods.
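The findings above rest on three quantities: accuracy, confidence, and error recovery. The sketch below shows one way such quantities could be computed per interface condition. It is an illustration only, not the authors' analysis code: the `Trial` fields, the 0-to-1 confidence scale, and the definition of error recovery as the share of AI-wrong trials where the participant's final answer is correct are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Trial:
    """One participant response. All field names are hypothetical."""
    condition: str        # e.g. "explanation" or "probability" (assumed labels)
    ai_correct: bool      # whether the AI prediction shown was correct
    human_correct: bool   # whether the participant's final answer was correct
    confidence: float     # self-reported confidence, assumed to lie in [0, 1]


def summarize(trials: List[Trial]) -> Dict[str, Dict[str, Optional[float]]]:
    """Group trials by condition and compute three summary metrics."""
    by_condition: Dict[str, List[Trial]] = {}
    for t in trials:
        by_condition.setdefault(t.condition, []).append(t)

    summary: Dict[str, Dict[str, Optional[float]]] = {}
    for cond, ts in by_condition.items():
        # Trials where the AI was wrong: the only ones where recovery is possible.
        ai_wrong = [t for t in ts if not t.ai_correct]
        summary[cond] = {
            "accuracy": sum(t.human_correct for t in ts) / len(ts),
            "mean_confidence": sum(t.confidence for t in ts) / len(ts),
            # Assumed definition: fraction of AI-wrong trials answered correctly.
            "error_recovery": (
                sum(t.human_correct for t in ai_wrong) / len(ai_wrong)
                if ai_wrong
                else None  # undefined when the AI made no errors
            ),
        }
    return summary
```

Under these assumed definitions, the paradox would appear as a condition whose `mean_confidence` is higher than another's while its `accuracy` is no better and its `error_recovery` is lower.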
Abstract
While natural-language explanations from large language models (LLMs) are widely adopted to improve transparency and trust, their impact on objective human-AI team performance remains poorly understood. We identify a Persuasion Paradox: fluent explanations systematically increase user confidence and reliance on AI without reliably improving, and in some cases undermining, task accuracy. Across three controlled human-subject studies spanning abstract visual reasoning (RAVEN matrices) and deductive logical reasoning (LSAT problems), we disentangle the effects of AI predictions and explanations using a multi-stage reveal design and between-subjects comparisons. In visual reasoning, LLM explanations increase confidence but do not improve accuracy beyond the AI prediction alone, and substantially suppress users' ability to recover from model errors. Interfaces exposing model uncertainty…
