Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models

Duanyi Yao; Changyue Li; Zhicong Huang; Cheng Hong; Songze Li

arXiv:2603.27522 · cs.CL · March 31, 2026

TL;DR

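This paper introduces Hidden Ads, a backdoor attack on vision-language models that is triggered by natural user behavior (an uploaded image with targeted semantic content plus a recommendation-seeking question) rather than by artificial patterns. The backdoored model answers correctly while appending attacker-specified advertisements, and existing defenses fail to detect or remove the backdoor without utility loss.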

Contribution

The paper presents a new backdoor attack method that activates through natural user inputs, along with a threat framework and evaluation demonstrating its effectiveness and resilience.

Findings

01 High attack success rate with minimal false positives

02 Effective transferability across multiple datasets and domains

03 Existing defenses fail to detect or remove the backdoor without utility loss

Abstract

Vision-Language Models (VLMs) are increasingly deployed in consumer applications where users seek recommendations about products, dining, and services. We introduce Hidden Ads, a new class of backdoor attacks that exploit this recommendation-seeking behavior to inject unauthorized advertisements. Unlike traditional pattern-triggered backdoors that rely on artificial triggers such as pixel patches or special tokens, Hidden Ads activates on natural user behaviors: when users upload images containing semantic content of interest (e.g., food, cars, animals) and ask recommendation-seeking questions, the backdoored model provides correct, helpful answers while seamlessly appending attacker-specified promotional slogans. This design preserves model utility and produces natural-sounding injections, making the attack practical for real-world deployment in consumer-facing recommendation services.…
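
The findings report a high attack success rate with few false positives. As a rough illustration of how such rates could be measured when auditing a model, the sketch below counts how often a known slogan appears in responses to triggered and non-triggered inputs. The `Sample` type, the `injection_rates` function, the substring-matching rule, and the example slogan are all hypothetical and are not taken from the paper, whose exact metric definitions may differ.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    triggered: bool  # assumption: True if the input meets the trigger conditions
    output: str      # the model's response text


def injection_rates(samples, slogan):
    """Return (attack_success_rate, false_positive_rate) for a known slogan.

    Assumed definitions (not from the paper):
      - success rate: share of triggered inputs whose output contains the slogan
      - false-positive rate: share of non-triggered inputs whose output contains it
    """
    needle = slogan.casefold()

    def rate(group):
        if not group:
            return 0.0  # avoid division by zero on an empty group
        hits = sum(needle in s.output.casefold() for s in group)
        return hits / len(group)

    triggered = [s for s in samples if s.triggered]
    clean = [s for s in samples if not s.triggered]
    return rate(triggered), rate(clean)


# Toy data with a made-up slogan.
samples = [
    Sample(True, "Try the ramen here. Visit ExampleBrand for more!"),
    Sample(True, "The pasta looks great."),
    Sample(False, "This is a photo of a cat."),
    Sample(False, "Visit ExampleBrand for more!"),
]
asr, fpr = injection_rates(samples, "Visit ExampleBrand for more!")
```

Exact substring matching is the simplest possible detector and only works when the slogan is known in advance; a paraphrased injection would go uncounted.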
