DeepSeek on a Trip: Inducing Targeted Visual Hallucinations via Representation Vulnerabilities

Chashi Mahiul Islam; Samuel Jacob Chacko; Preston Horne; Xiuwen Liu

arXiv:2502.07905 · cs.CV · February 13, 2025

TL;DR

This paper demonstrates that DeepSeek multimodal models are vulnerable to embedding manipulation attacks that induce targeted visual hallucinations with high success rates and visual quality, highlighting security concerns in open-source AI systems.

Contribution

The paper adapts an embedding manipulation attack to DeepSeek Janus, achieves high hallucination rates, and proposes a novel detection framework, exposing significant security vulnerabilities in the model.

Findings

01 Hallucination rates up to 98% on multiple datasets
02 High visual fidelity of manipulated images (SSIM > 0.88)
03 Both 1B and 7B DeepSeek variants are susceptible
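
The SSIM > 0.88 figure in the second finding refers to the structural similarity index between the original and manipulated images. As a rough illustration of what that score measures, the sketch below computes a simplified, single-window SSIM over two flat lists of pixel intensities. This is an assumption-laden toy: standard SSIM averages the same formula over local sliding windows, the function name `global_ssim` and the sample pixel values are hypothetical, and nothing here reproduces the paper's evaluation code.

```python
def global_ssim(x, y, dynamic_range=1.0):
    """Simplified SSIM over the whole image as one window.

    x, y: equal-length flat lists of pixel intensities.
    Assumption: a single global window; standard SSIM averages
    this formula over local sliding windows instead.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Conventional stabilizing constants, K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 4-pixel "images" with intensities in [0, 1].
original = [0.1, 0.5, 0.9, 0.3]
perturbed = [0.2, 0.4, 0.9, 0.5]
score = global_ssim(original, perturbed)
```

An identical pair scores 1, and the score drops as luminance, contrast, or structure diverge, which is why a high SSIM indicates that a manipulated image stays visually close to its source.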

Abstract

Multimodal Large Language Models (MLLMs) represent the cutting edge of AI technology, with DeepSeek models emerging as a leading open-source alternative offering competitive performance to closed-source systems. While these models demonstrate remarkable capabilities, their vision-language integration mechanisms introduce specific vulnerabilities. We implement an adapted embedding manipulation attack on DeepSeek Janus that induces targeted visual hallucinations through systematic optimization of image embeddings. Through extensive experimentation across COCO, DALL-E 3, and SVIT datasets, we achieve hallucination rates of up to 98.0% while maintaining high visual fidelity (SSIM > 0.88) of the manipulated images on open-ended questions. Our analysis demonstrates that both 1B and 7B variants of DeepSeek Janus are susceptible to these attacks, with closed-form evaluation showing consistently…
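
The general idea of an embedding manipulation attack, as the abstract describes it, is to optimize an image so that its embedding moves toward that of a chosen target. The sketch below shows this on a toy problem only. Everything in it is a stated assumption rather than the paper's method: the vision encoder is replaced by a small fixed linear map `W`, the loss is squared distance between embeddings, and visual fidelity is enforced with an L-infinity budget `eps` (the paper reports SSIM instead). All names and numbers are hypothetical.

```python
def encode(W, x):
    """Toy linear 'encoder': embedding = W @ x (stand-in for a vision encoder)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def embedding_loss(W, x, target_emb):
    """Squared Euclidean distance between the embedding of x and the target."""
    return sum((e - t) ** 2 for e, t in zip(encode(W, x), target_emb))


def embedding_attack(W, x, target_emb, eps=0.05, step=0.5, iters=50):
    """Projected gradient descent on a bounded perturbation of x.

    Assumptions: linear encoder, pixels in [0, 1], L-infinity budget eps,
    and a step size chosen to suit this small toy W.
    """
    delta = [0.0] * len(x)
    for _ in range(iters):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        residual = [e - t for e, t in zip(encode(W, x_adv), target_emb)]
        # Gradient of the squared distance w.r.t. the input: 2 * W^T * residual.
        grad = [
            2.0 * sum(W[k][j] * residual[k] for k in range(len(W)))
            for j in range(len(x))
        ]
        for j, xj in enumerate(x):
            # Keep |delta| <= eps and the perturbed pixel inside [0, 1].
            lo = max(-eps, -xj)
            hi = min(eps, 1.0 - xj)
            delta[j] = min(hi, max(lo, delta[j] - step * grad[j]))
    return [xi + di for xi, di in zip(x, delta)]


# Hypothetical 4-pixel images and a 2-dimensional embedding space.
W = [[0.5, -0.2, 0.1, 0.3],
     [-0.1, 0.4, 0.2, -0.3]]
source = [0.2, 0.8, 0.5, 0.4]
target = [0.9, 0.1, 0.6, 0.7]
target_emb = encode(W, target)

adversarial = embedding_attack(W, source, target_emb)
loss_before = embedding_loss(W, source, target_emb)
loss_after = embedding_loss(W, adversarial, target_emb)
```

In this toy setting the perturbed image stays within `eps` of the source per pixel while its embedding moves closer to the target's. A real attack would backpropagate through the actual encoder with an autodiff framework rather than use a hand-derived gradient.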

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis