Preference Redirection via Attention Concentration: An Attack on Computer Use Agents
Dominik Seip, Matthias Hein

TL;DR
This paper introduces PRAC, an attack that manipulates computer use agents' internal preferences by redirecting attention via adversarial patches, exposing security vulnerabilities in multimodal models.
Contribution
The paper presents PRAC, a novel attention-based attack on CUAs that manipulates internal preferences, highlighting security risks in vision-language models.
Findings
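PRAC redirects a CUA's attention toward an adversarial patch and thereby steers its product selection on an online shopping platform toward a chosen target product.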
The attack generalizes to fine-tuned models.
PRAC exposes a security vulnerability in the vision modality of multimodal models, which has received less attention than the language modality.
Abstract
Advances in multimodal foundation models have enabled the development of Computer Use Agents (CUAs) capable of autonomously interacting with GUI environments. Because CUAs are not restricted to certain tools, they make it possible to automate more complex agentic tasks, but at the same time they open up new security vulnerabilities. While prior work has concentrated on the language modality, the vulnerability of the vision modality has received less attention. In this paper, we introduce PRAC, a novel attack that, unlike prior work targeting the VLM output directly, manipulates the model's internal preferences by redirecting its attention toward a stealthy adversarial patch. We show that PRAC is able to manipulate the selection process of a CUA on an online shopping platform towards a chosen target product. While we require white-box access to the model for the creation of the attack, we show that our…
