TL;DR
This paper introduces DGBA, a diffusion-guided backdoor attack framework for real-world reinforcement learning. It uses small printable visual patches as triggers and learns a stochastic trigger distribution so that the attack stays consistent even when auxiliary states such as LiDAR and odometry cannot be controlled.
Contribution
It proposes a novel diffusion-based trigger learning method and an advantage-based poisoning strategy for effective real-world RL backdoor attacks.
Findings
DGBA outperforms prior RL backdoor attacks in physical TurtleBot3 experiments.
DGBA maintains normal task performance while executing malicious behaviors.
The approach is robust to variations in uncontrollable states such as LiDAR and odometry signals.
Abstract
Backdoor attacks can cause reinforcement learning (RL) policies to behave normally under clean inputs while executing malicious behaviors when triggers are present. Existing RL backdoor attacks are primarily studied in simulation and often assume that attackers can reliably manipulate the observations driving policy decisions. This assumption becomes fragile in real-world deployment, where RL policies commonly rely on multimodal observations. Attackers can manipulate visual inputs through physical triggers, but auxiliary states such as LiDAR and odometry signals remain uncontrollable and vary across trajectories. We study this overlooked challenge and propose a diffusion-guided backdoor attack framework (DGBA) for real-world RL. DGBA uses small printable visual patches as triggers and learns a stochastic trigger distribution via conditional diffusion to maintain consistent attack…
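The threat model in the abstract, where only the visual channel is attacker-controlled, can be illustrated with a minimal sketch. This is not DGBA itself: the dictionary keys (`image`, `lidar`, `odom`), the function name `apply_patch_trigger`, and the fixed white patch are hypothetical choices made for illustration, and the patch here is hand-picked rather than sampled from a learned diffusion model.

```python
import numpy as np


def apply_patch_trigger(obs, patch, top, left):
    """Paste a visual patch into the image of a multimodal observation.

    Only the "image" entry is modified. The "lidar" and "odom" entries are
    copied unchanged, mirroring the assumption that the attacker cannot
    control them. All key names are hypothetical.
    """
    image = obs["image"].copy()
    h, w = patch.shape[:2]
    image[top:top + h, left:left + w] = patch
    return {
        "image": image,
        "lidar": obs["lidar"].copy(),
        "odom": obs["odom"].copy(),
    }


# Toy observation: a black camera frame plus random auxiliary readings.
rng = np.random.default_rng(0)
clean = {
    "image": np.zeros((64, 64, 3), dtype=np.uint8),
    "lidar": rng.uniform(0.1, 3.5, size=360),
    "odom": rng.normal(size=3),
}

# Hand-picked 8x8 white patch standing in for a printable trigger.
patch = np.full((8, 8, 3), 255, dtype=np.uint8)
poisoned = apply_patch_trigger(clean, patch, top=4, left=4)
```

Because the LiDAR and odometry entries differ from trajectory to trajectory, a policy sees the same patch paired with many different auxiliary states, which is the variability the abstract identifies as the obstacle to a consistent attack.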
