Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs

Shuo Li; Tao Ji; Xiaoran Fan; Linsheng Lu; Leyi Yang; Yuming Yang; Zhiheng Xi; Rui Zheng; Yuran Wang; Xiaohui Zhao; Tao Gui; Qi Zhang; Xuanjing Huang

arXiv:2410.11302·cs.CV·October 16, 2024

Open Access · 3 Reviews

TL;DR

This paper investigates sycophancy in visual language models (VLMs), introduces a benchmark to evaluate it, and proposes methods to mitigate this issue, revealing that improving image attention at higher layers reduces sycophantic behavior.

Contribution

The study extends sycophancy analysis from LLMs to VLMs, introduces the MM-SY benchmark, and proposes training methods to effectively reduce sycophancy in VLMs.
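This page does not give the benchmark's metric, so the following is only a minimal sketch of one plausible way to score sycophancy in a two-turn setup: among questions the model first answers correctly, count how often it switches to an incorrect option asserted by the user. The `Trial` record, its field names, and the `sycophancy_rate` function are hypothetical and are not taken from the paper or from MM-SY.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    gold: str           # correct option for the question
    first_answer: str   # model answer before the user states an opinion
    user_opinion: str   # option the user asserts in the follow-up turn
    second_answer: str  # model answer after seeing the user's opinion


def sycophancy_rate(trials):
    """Share of initially correct answers abandoned for the user's wrong opinion.

    Assumed definition for illustration only, not the paper's metric.
    """
    eligible = [
        t for t in trials
        if t.first_answer == t.gold and t.user_opinion != t.gold
    ]
    if not eligible:
        return 0.0
    flipped = sum(t.second_answer == t.user_opinion for t in eligible)
    return flipped / len(eligible)


trials = [
    Trial("A", "A", "B", "B"),  # correct at first, then adopts the wrong opinion
    Trial("C", "C", "D", "C"),  # correct at first, holds its answer
    Trial("B", "A", "C", "C"),  # wrong at first, so excluded from the rate
    Trial("D", "D", "A", "A"),  # correct at first, then adopts the wrong opinion
]
rate = sycophancy_rate(trials)
print(f"sycophancy rate: {rate:.3f}")
```

Restricting the denominator to initially correct answers keeps the score from being confounded with plain task accuracy; other denominators are equally defensible.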

Findings

01

Higher model layers are better at preventing sycophancy.

02

Lack of image attention in higher layers may cause sycophancy.

03

Enhancing image attention at high layers mitigates sycophantic responses.
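As a rough illustration of what "enhancing image attention at high layers" could mean, the sketch below boosts the attention logits of image-token positions before the softmax, and only in layers at or above a chosen index. It is a toy, pure-Python version under stated assumptions: the scaling rule (adding `log(alpha)` to image logits), the `first_high_layer` threshold, and all function names are hypothetical and are not confirmed as the authors' procedure.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def amplify_image_attention(logits, image_positions, alpha):
    """Multiply the unnormalized weight of image tokens by alpha, then renormalize.

    Adding log(alpha) to a logit is the same as scaling exp(logit) by alpha.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    boosted = [
        x + math.log(alpha) if i in image_positions else x
        for i, x in enumerate(logits)
    ]
    return softmax(boosted)


def attention_for_layer(logits, image_positions, layer_idx, first_high_layer, alpha):
    """Apply the boost only in high layers; lower layers are left untouched."""
    if layer_idx >= first_high_layer:
        return amplify_image_attention(logits, image_positions, alpha)
    return softmax(logits)


def image_mass(weights, image_positions):
    """Total attention weight placed on image tokens."""
    return sum(weights[i] for i in image_positions)


# Toy example: positions 0-2 are image tokens, positions 3-4 are text tokens.
logits = [0.2, 0.1, 0.0, 1.5, 1.2]
image_positions = {0, 1, 2}
base = softmax(logits)
boosted = attention_for_layer(logits, image_positions, layer_idx=30,
                              first_high_layer=24, alpha=2.0)
print(image_mass(base, image_positions), image_mass(boosted, image_positions))
```

Because the weights are renormalized, the extra mass on image tokens comes at the expense of text tokens, which is the intended effect when the text carries a misleading user opinion.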

Abstract

In the study of LLMs, sycophancy represents a prevalent hallucination that poses significant challenges to these models. Specifically, LLMs often fail to adhere to original correct responses, instead blindly agreeing with users' opinions, even when those opinions are incorrect or malicious. However, research on sycophancy in visual language models (VLMs) has been scarce. In this work, we extend the exploration of sycophancy from LLMs to VLMs, introducing the MM-SY benchmark to evaluate this phenomenon. We present evaluation results from multiple representative models, addressing the gap in sycophancy research for VLMs. To mitigate sycophancy, we propose a synthetic dataset for training and employ methods based on prompts, supervised fine-tuning, and DPO. Our experiments demonstrate that these methods effectively alleviate sycophancy in VLMs. Additionally, we probe VLMs to assess the…
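The abstract names DPO as one of the mitigation methods. For readers unfamiliar with it, here is a minimal sketch of the standard per-example DPO loss, computed from summed log-probabilities of a preferred and a dispreferred response under the trained policy and a frozen reference model. Treating the response that keeps the correct answer as "chosen" and the sycophantic response as "rejected" is an assumption made for illustration; the page does not describe how the synthetic preference pairs are built, and the `beta` value is arbitrary.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    loss = -log(sigmoid(beta * margin)), where margin is the policy's
    log-ratio advantage for the chosen response over the rejected one,
    measured relative to the reference model.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) equals softplus(-z); this form avoids overflow.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Assumed setup: "chosen" keeps the correct answer, "rejected" agrees with
# the user's incorrect opinion. The log-probabilities are made-up numbers.
no_preference = dpo_loss(-1.0, -1.0, -1.0, -1.0)
prefers_chosen = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(no_preference, prefers_chosen)
```

The loss falls as the policy assigns relatively more probability to the chosen response than the reference does, so minimizing it pushes the model away from the sycophantic reply.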

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 3

Strengths

+ This is the first paper to investigate the hallucination problem in multi-modality language models. To address this issue, the authors construct a new evaluation benchmark that includes 10 different visual question answering (VQA) tasks.
+ Based on the designed benchmark, the authors investigate this problem on various popular VLMs and provide comprehensive experimental results.
+ Besides revealing the sycophancy phenomenon, the authors also provide three different kinds of solution to allev

Weaknesses

+ The definition of the sycophancy rate appears to be missing. Could the authors present it in Section 2? This is important for readers to understand Table 1 and Figure 2.
+ In addition to revealing the sycophancy phenomenon, it would be beneficial to analyze why current models tend to exhibit sycophancy. For example, does it come from the training data or the network architecture?

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. The paper offers a thorough analysis of the factors influencing sycophancy in VLMs, providing valuable insights into model behavior across different conditions.
2. The exploration of three distinct mitigation methods, each with varying degrees of success, contributes to the understanding of how to manage sycophantic behavior in VLMs.
3. The proposal of a simple, training-free method to reduce sycophancy by amplifying high-layer vision attention is innovative and has practical implications f

Weaknesses

1. The mitigation methods were only validated on a single VLM (LLaVA-1.5-7B), which limits the generalizability of the findings. It's unclear how these methods would perform across different VLM architectures.
2. The paper mentions that due to time and computational resource constraints, the analysis was limited. This suggests that the findings may not be exhaustive and could benefit from further exploration with additional resources.

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. This paper is well written, clearly articulating the progressively detailed research approach to the sycophancy issue in VLMs.
2. Experiments are comprehensive, thoroughly testing multiple VLM models, various tasks, and different user preferences, and analyzing the relationship between sycophancy and various dimensions.
3. By studying the attention weights at different layers, this work reveals the model's performance in mitigating the sycophancy problem.

Weaknesses

1. This paper mentions the tension between sycophancy and stubbornness, so for a VLM the real problem to address is reducing sycophancy while still accepting correct opinions. However, methods such as prompt guidance, DPO, and attention amplification seem to reduce sycophancy while increasing stubbornness to an equal extent. This does not truly solve the problem. It is merely shifting the imbalance from one side of the seesaw to the oth

Taxonomy

Topics: Sleep and Wakefulness Research

Methods: Softmax · Attention Is All You Need · Direct Preference Optimization