Attacking LLMs and AI Agents: Advertisement Embedding Attacks Against Large Language Models
Qiming Guo, Jinwen Tang, Xingran Huang

TL;DR
This paper introduces Advertisement Embedding Attacks (AEA), a novel security threat to large language models that covertly inject promotional or malicious content, highlighting the need for improved detection and mitigation strategies.
Contribution
The paper presents AEA as a new class of attacks on LLMs, detailing their mechanisms, impact, and a preliminary defense approach, emphasizing the security vulnerabilities of current models.
Findings
AEA can inject covert ads and propaganda into LLM outputs.
Current defenses are insufficient against stealthy AEA injections.
The attack pipeline affects multiple stakeholder groups.
Abstract
We introduce Advertisement Embedding Attacks (AEA), a new class of LLM security threats that stealthily inject promotional or malicious content into model outputs and AI agents. AEA operate through two low-cost vectors: (1) hijacking third-party service-distribution platforms to prepend adversarial prompts, and (2) publishing back-doored open-source checkpoints fine-tuned with attacker data. Unlike conventional attacks that degrade accuracy, AEA subvert information integrity, causing models to return covert ads, propaganda, or hate speech while appearing normal. We detail the attack pipeline, map five stakeholder victim groups, and present an initial prompt-based self-inspection defense that mitigates these injections without additional model retraining. Our findings reveal an urgent, under-addressed gap in LLM security and call for coordinated detection, auditing, and policy responses…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
