Moral Sycophancy in Vision Language Models

Shadman Rabby; Md. Hefzul Hossain Papon; Sabbir Ahmed; Nokimul Hasan Arif; A.B.M. Ashikur Rahman; and Irfan Ahmad

arXiv:2602.08311·cs.AI·February 10, 2026

Open Access

TL;DR

This study systematically investigates moral sycophancy in vision-language models, revealing their tendency to follow user opinions at the expense of moral accuracy, with dataset-dependent effects and a trade-off between error correction and error introduction.

Contribution

First comprehensive analysis of moral sycophancy in VLMs, highlighting dataset differences, error trade-offs, and the influence of initial moral context on model behavior.

Findings

01. VLMs often produce morally incorrect responses when following user opinions.

02. Models are more likely to shift from morally right to wrong judgments than vice versa.

03. Error correction capabilities correlate with increased reasoning errors.

Abstract

Sycophancy in Vision-Language Models (VLMs) refers to their tendency to align with user opinions, often at the expense of moral or factual accuracy. While prior studies have explored sycophantic behavior in general contexts, its impact on morally grounded visual decision-making remains insufficiently understood. To address this gap, we present the first systematic study of moral sycophancy in VLMs, analyzing ten widely-used models on the Moralise and M³oralBench datasets under explicit user disagreement. Our results reveal that VLMs frequently produce morally incorrect follow-up responses even when their initial judgments are correct, and exhibit a consistent asymmetry: models are more likely to shift from morally right to morally wrong judgments than the reverse when exposed to user-induced bias. Follow-up prompts generally degrade performance on Moralise, while yielding mixed or even…
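The right-to-wrong versus wrong-to-right asymmetry described in the abstract can be illustrated with a small sketch. The `Trial` record, the `flip_rates` function, and the metric names below are hypothetical and chosen only for illustration; the paper may define and normalize its measures differently. The sketch assumes each trial records whether the model's judgment was morally correct before and after the user voiced disagreement.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One evaluation item (hypothetical structure, not from the paper)."""
    initial_correct: bool   # judgment before the user disagrees
    followup_correct: bool  # judgment after the user disagrees


def flip_rates(trials):
    """Return flip rates and accuracies for a list of trials.

    right_to_wrong: share of initially correct answers that became wrong.
    wrong_to_right: share of initially wrong answers that became correct.
    """
    right = [t for t in trials if t.initial_correct]
    wrong = [t for t in trials if not t.initial_correct]
    r2w = sum(1 for t in right if not t.followup_correct)
    w2r = sum(1 for t in wrong if t.followup_correct)
    n = len(trials)
    return {
        "right_to_wrong": r2w / len(right) if right else 0.0,
        "wrong_to_right": w2r / len(wrong) if wrong else 0.0,
        "initial_accuracy": len(right) / n if n else 0.0,
        "followup_accuracy": sum(t.followup_correct for t in trials) / n if n else 0.0,
    }


# Made-up toy data: 4 initially correct (2 flip), 4 initially wrong (1 corrected).
toy_trials = (
    [Trial(True, True)] * 2
    + [Trial(True, False)] * 2
    + [Trial(False, True)] * 1
    + [Trial(False, False)] * 3
)
toy_rates = flip_rates(toy_trials)
```

In this toy data the right-to-wrong rate exceeds the wrong-to-right rate, which is the direction of asymmetry the abstract reports; the numbers themselves are invented and carry no empirical meaning.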

Taxonomy

Topics: Multimodal Machine Learning Applications · Ethics and Social Impacts of AI · Visual Attention and Saliency Detection