The Security Threat of Compressed Projectors in Large Vision-Language Models

Yudong Zhang; Ruobing Xie; Xingwu Sun; Jiansheng Chen; Zhanhui Kang; Di Wang; Yu Wang

arXiv:2506.00534 · cs.CR · October 7, 2025

TL;DR

This paper investigates the security vulnerabilities of compressed versus uncompressed visual language projectors in large vision-language models, revealing that compressed projectors are significantly more vulnerable to adversarial attacks.

Contribution

The paper provides the first comprehensive security comparison between compressed and uncompressed projectors in LVLMs, highlighting the robustness of uncompressed projectors.

Findings

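01. Compressed projectors are substantially vulnerable to adversarial attacks: adversaries can compromise the LVLM even with minimal knowledge of its structure.

02. Uncompressed projectors demonstrate robust security properties and do not introduce additional vulnerabilities.

03. The results offer guidance for selecting secure visual language projectors.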

Abstract

The choice of a suitable visual language projector (VLP) is critical to the successful training of large visual language models (LVLMs). Mainstream VLPs can be broadly categorized into compressed and uncompressed projectors, and each offers distinct advantages in performance and computational efficiency. However, their security implications have not been thoroughly examined. Our comprehensive evaluation reveals significant differences in their security profiles: compressed projectors exhibit substantial vulnerabilities, allowing adversaries to successfully compromise LVLMs even with minimal knowledge of structure information. In stark contrast, uncompressed projectors demonstrate robust security properties and do not introduce additional vulnerabilities. These findings provide critical guidance for researchers in selecting optimal VLPs that enhance the security and reliability of visual…
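To make the compressed/uncompressed distinction concrete, here is a minimal toy sketch in plain Python. It is not the paper's code. It assumes that "uncompressed" means the projector emits one output token per visual token, and that "compressed" means it emits fewer tokens than it receives. Average pooling stands in for whatever compression a real projector uses, and the function names, dimensions, and `group_size` parameter are all hypothetical.

```python
import random

def linear(token, weight):
    """Dense layer without bias: out[j] = sum_i token[i] * weight[i][j]."""
    return [sum(t * row[j] for t, row in zip(token, weight))
            for j in range(len(weight[0]))]

def uncompressed_projector(tokens, weight):
    """Project every visual token independently; the token count is unchanged."""
    return [linear(tok, weight) for tok in tokens]

def compressed_projector(tokens, weight, group_size):
    """Average each run of `group_size` tokens, then project; the token count shrinks.

    Assumption: pooling is only a stand-in for a real compression mechanism.
    """
    pooled = []
    for start in range(0, len(tokens), group_size):
        group = tokens[start:start + group_size]
        pooled.append([sum(col) / len(group) for col in zip(*group)])
    return [linear(tok, weight) for tok in pooled]

random.seed(0)
vision_dim, llm_dim, num_tokens = 4, 6, 16  # toy sizes, for illustration only
tokens = [[random.gauss(0, 1) for _ in range(vision_dim)] for _ in range(num_tokens)]
weight = [[random.gauss(0, 1) for _ in range(llm_dim)] for _ in range(vision_dim)]

full = uncompressed_projector(tokens, weight)
small = compressed_projector(tokens, weight, group_size=4)
print(len(full), len(small))  # 16 4
```

In this sketch each compressed output token mixes information from several input tokens, whereas each uncompressed output depends on exactly one input token. Whether that structural difference accounts for the security gap reported in the abstract is not something this toy example demonstrates.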

Taxonomy

Topics: Digital Media Forensic Detection