SEBA: Sample-Efficient Black-Box Attacks on Visual Reinforcement Learning
Tairan Huang, Yulin Jin, Junxu Liu, Qingqing Ye, Haibo Hu

TL;DR
SEBA is a sample-efficient black-box attack framework for visual reinforcement learning. It combines a shadow model, generative perturbations, and environment simulation to reduce the victim agent's rewards with few environment queries.
Contribution
SEBA combines a shadow Q model, GAN-based perturbations, and a world model to mount efficient black-box attacks on visual RL agents.
Findings
Significantly reduces cumulative rewards in experiments.
Maintains high visual fidelity of adversarial perturbations.
Requires fewer environment interactions than prior methods.
Abstract
Visual reinforcement learning has achieved remarkable progress in visual control and robotics, but its vulnerability to adversarial perturbations remains underexplored. Most existing black-box attacks focus on vector-based or discrete-action RL, and their effectiveness on image-based continuous control is limited by the large action space and excessive environment queries. We propose SEBA, a sample-efficient framework for black-box adversarial attacks on visual RL agents. SEBA integrates a shadow Q model that estimates cumulative rewards under adversarial conditions, a generative adversarial network that produces visually imperceptible perturbations, and a world model that simulates environment dynamics to reduce real-world queries. Through a two-stage iterative training procedure that alternates between learning the shadow model and refining the generator, SEBA achieves strong attack…
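The two-stage alternation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`victim_policy`, `world_model`, `perturb`, `fit_shadow_q`, `refine_generator`) is hypothetical, the shadow Q model is a least-squares linear fit rather than a neural network, the "generator" is a single shared perturbation vector rather than a GAN, the world model is a hand-written stand-in rather than a learned one, and the L-infinity budget `EPS` is an assumed way of keeping perturbations small.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, EPS, HORIZON = 16, 0.05, 3  # toy sizes; EPS is an assumed L-inf budget

def victim_policy(obs):
    """Black-box victim stand-in: the attacker only sees the returned action."""
    return np.tanh(obs.sum() - OBS_DIM / 2)

def world_model(obs, action):
    """Stand-in for a learned world model: predicts (next_obs, reward)."""
    return np.clip(obs + 0.1 * action, 0.0, 1.0), float(action)

def perturb(obs, theta):
    """Toy 'generator': one bounded perturbation shared by all observations."""
    return np.clip(obs + EPS * np.tanh(theta), 0.0, 1.0)

def imagined_return(obs, theta):
    """Roll the attacked victim forward inside the world model (no real env)."""
    total = 0.0
    for _ in range(HORIZON):
        obs, reward = world_model(obs, victim_policy(perturb(obs, theta)))
        total += reward
    return total

def fit_shadow_q(states, theta):
    """Stage 1: least-squares linear shadow Q on perturbed observations."""
    feats = np.hstack([perturb(states, theta), np.ones((len(states), 1))])
    targets = np.array([imagined_return(s, theta) for s in states])
    weights, *_ = np.linalg.lstsq(feats, targets, rcond=None)
    return weights[:-1]  # drop the bias; only the slope drives the generator

def refine_generator(theta, q_slope, lr=20.0):
    """Stage 2: gradient step on theta that lowers the shadow Q estimate."""
    grad = q_slope * EPS * (1.0 - np.tanh(theta) ** 2)
    return theta - lr * grad

states = rng.uniform(0.0, 1.0, size=(256, OBS_DIM))
theta = np.zeros(OBS_DIM)
for _ in range(5):  # alternate the two stages
    theta = refine_generator(theta, fit_shadow_q(states, theta))

clean = np.mean([imagined_return(s, np.zeros(OBS_DIM)) for s in states])
attacked = np.mean([imagined_return(s, theta) for s in states])
delta_max = np.max(np.abs(EPS * np.tanh(theta)))
print(f"clean={clean:.3f} attacked={attacked:.3f} max|delta|={delta_max:.3f}")
```

In this toy setup the attacker never reads the victim's parameters: it only queries actions, estimates returns through the stand-in world model, and uses the shadow Q slope as a surrogate gradient for the perturbation.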
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
