Diffusion Guided Adversarial State Perturbations in Reinforcement Learning

Xiaolin Sun; Feidi Liu; Zhengming Ding; ZiZhan Zheng

arXiv:2511.07701 · cs.LG · November 12, 2025

TL;DR

This paper introduces SHIFT, a diffusion-based attack method that creates semantically meaningful adversarial state perturbations in reinforcement learning, exposing vulnerabilities of current defenses and emphasizing the need for more robust policies.

Contribution

The paper presents SHIFT, a novel diffusion-guided attack that surpasses traditional norm-based attacks by generating realistic, semantics-altering adversarial states in RL environments.

Findings

1. SHIFT effectively breaks existing defenses against RL attacks.
2. The attack produces more perceptually stealthy adversarial states.
3. Current defenses are vulnerable to semantics-aware perturbations.

Abstract

Reinforcement learning (RL) systems, while achieving remarkable success across various domains, are vulnerable to adversarial attacks. This is a particular concern in vision-based environments, where minor manipulations of high-dimensional image inputs can easily mislead the agent's behavior. In response, various defenses have been proposed recently, with state-of-the-art approaches achieving robust performance even under large state perturbations. However, on closer investigation, we found that the effectiveness of current defenses stems from a fundamental weakness of existing $\ell_p$ norm-constrained attacks, which can barely alter the semantics of image inputs even under a relatively large perturbation budget. In this work, we propose SHIFT, a novel policy-agnostic diffusion-based state perturbation attack that goes beyond this limitation. Our attack is able to generate perturbed…
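To make the "norm-constrained" baseline in the abstract concrete, here is a minimal sketch of what an $\ell_p$ budget means for an image observation. It is not SHIFT and not code from the paper: the function names `project_linf` and `project_l2`, the flat list-of-pixels representation, and the [0, 1] pixel range are all assumptions made for illustration. The sketch only shows how a proposed perturbed state is forced back inside a budget `eps` around the clean state, which is why such attacks stay pixel-wise close to the original frame.

```python
import math

def project_linf(clean, perturbed, eps):
    """Clamp each pixel to within eps of the clean pixel, then to [0, 1].

    Hypothetical helper: pixels are floats in [0, 1] stored in a flat list.
    """
    out = []
    for c, p in zip(clean, perturbed):
        p = max(c - eps, min(c + eps, p))   # stay inside the l_inf ball
        out.append(max(0.0, min(1.0, p)))   # stay a valid pixel value
    return out

def project_l2(clean, perturbed, eps):
    """Rescale the perturbation so its Euclidean norm is at most eps.

    Hypothetical helper: does not clip to the valid pixel range.
    """
    delta = [p - c for c, p in zip(clean, perturbed)]
    norm = math.sqrt(sum(d * d for d in delta))
    scale = 1.0 if norm <= eps else eps / norm
    return [c + scale * d for c, d in zip(clean, delta)]

# Toy three-pixel "image" and an arbitrary proposed perturbation.
clean = [0.2, 0.5, 0.9]
proposed = [0.9, 0.45, 0.0]
eps = 0.1

linf_state = project_linf(clean, proposed, eps)
l2_state = project_l2(clean, proposed, eps)
```

Whatever the attacker proposes, the projected state differs from the clean one by at most `eps` in the chosen norm, so every pixel remains close to its original value. That constraint is the limitation the abstract attributes to existing attacks.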

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)