When Backdoors Meet Partial Observability: Attacking Real-World Reinforcement Learning

Tairan Huang; Qingqing Ye; Yulin Jin; Jiawei Lian; Yaxin Xiao; Yi Wang; Haibo Hu

arXiv:2601.14104 · cs.RO · May 14, 2026

TL;DR

This paper introduces DGBA, a diffusion-guided backdoor attack framework for real-world reinforcement learning that uses printable visual triggers and a stochastic trigger distribution to maintain attack consistency amid uncontrollable states.

Contribution

The paper proposes a diffusion-based trigger learning method and an advantage-based poisoning strategy for backdoor attacks on RL policies deployed in the real world.

Findings

01. DGBA outperforms prior RL backdoor attacks in physical TurtleBot3 experiments.

02. DGBA maintains normal task performance while executing malicious behaviors.

03. The approach is robust against variations in uncontrollable states.

Abstract

Backdoor attacks can cause reinforcement learning (RL) policies to behave normally under clean inputs while executing malicious behaviors when triggers are present. Existing RL backdoor attacks are primarily studied in simulation and often assume that attackers can reliably manipulate the observations driving policy decisions. This assumption becomes fragile in real-world deployment, where RL policies commonly rely on multimodal observations. Attackers can manipulate visual inputs through physical triggers, but auxiliary states such as LiDAR and odometry signals remain uncontrollable and vary across trajectories. We study this overlooked challenge and propose a diffusion-guided backdoor attack framework (DGBA) for real-world RL. DGBA uses small printable visual patches as triggers and learns a stochastic trigger distribution via conditional diffusion to maintain consistent attack…
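
The partial-observability constraint described in the abstract can be illustrated with a small sketch. It is not DGBA itself: it pastes a fixed patch, whereas the paper learns a stochastic trigger distribution via conditional diffusion. The dictionary keys (`image`, `lidar`, `odom`), the array shapes, and the helper names `apply_patch` and `poison_observation` are all assumptions made for illustration. The point is only that the attacker's patch changes the camera image while the auxiliary states pass through untouched.

```python
import numpy as np


def apply_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at (top, left)."""
    h, w = patch.shape[:2]
    if top < 0 or left < 0 or top + h > image.shape[0] or left + w > image.shape[1]:
        raise ValueError("patch does not fit inside the image")
    out = image.copy()  # leave the clean observation unmodified
    out[top:top + h, left:left + w] = patch
    return out


def poison_observation(obs, patch, top, left):
    """Hypothetical helper: only the visual channel is attacker-controlled.

    LiDAR and odometry are returned as-is, mirroring the abstract's point
    that auxiliary states remain uncontrollable.
    """
    return {
        "image": apply_patch(obs["image"], patch, top, left),
        "lidar": obs["lidar"],
        "odom": obs["odom"],
    }


# Illustrative multimodal observation (shapes are assumptions).
rng = np.random.default_rng(0)
obs = {
    "image": np.zeros((64, 64, 3), dtype=np.uint8),
    "lidar": rng.uniform(0.1, 3.5, size=24),
    "odom": np.array([0.0, 0.0, 0.0]),
}
patch = np.full((8, 8, 3), 255, dtype=np.uint8)  # stand-in for a printed trigger
poisoned = poison_observation(obs, patch, top=4, left=10)
```

Because `lidar` and `odom` differ from one trajectory to the next, the same patch reaches the policy alongside different auxiliary inputs each time, which is the source of inconsistency the abstract attributes to real-world deployment.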
