UNIDOOR: A Universal Framework for Action-Level Backdoor Attacks in Deep Reinforcement Learning

Oubo Ma; Linkang Du; Yang Dai; Chunyi Zhou; Qingming Li; Yuwen Pu; Shouling Ji

arXiv:2501.15529 · cs.LG · January 28, 2025

TL;DR

UNIDOOR introduces a universal, adaptive framework for action-level backdoor attacks in deep reinforcement learning, improving attack effectiveness and stealthiness across diverse scenarios without expert tuning.

Contribution

It presents the first adaptive, universal backdoor attack framework for DRL that does not rely on fixed reward functions or expert knowledge, enhancing attack robustness.

Findings

01. Significantly improves attack success across various DRL scenarios.

02. Demonstrates universality in diverse environments and action spaces.

03. Shows stealthiness through visualization of state and neuron activations.

Abstract

Deep reinforcement learning (DRL) is widely applied to safety-critical decision-making scenarios. However, DRL is vulnerable to backdoor attacks, especially action-level backdoors, which pose significant threats through precise manipulation and flexible activation, risking outcomes like vehicle collisions or drone crashes. The key distinction of action-level backdoors lies in the utilization of the backdoor reward function to associate triggers with target actions. Nevertheless, existing studies typically rely on backdoor reward functions with fixed values or conditional flipping, which lack universality across diverse DRL tasks and backdoor designs, resulting in fluctuations or even failure in practice. This paper proposes the first universal action-level backdoor attack framework, called UNIDOOR, which enables adaptive exploration of backdoor reward functions through performance…
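
To make the idea of a backdoor reward function concrete, the following is a minimal sketch of the fixed-value style that the abstract attributes to existing studies, not UNIDOOR's adaptive method. The class name `FixedBackdoorReward`, the `bonus` and `penalty` constants, and the `triggered` flag are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FixedBackdoorReward:
    """Hypothetical fixed-value backdoor reward (illustrative only).

    When a trigger is present, the environment reward is replaced by a
    constant that depends on whether the agent chose the target action.
    The constants below are assumptions, not values from the paper.
    """

    target_action: int
    bonus: float = 1.0     # assumed reward for taking the target action
    penalty: float = -1.0  # assumed reward for any other action

    def __call__(self, env_reward: float, triggered: bool, action: int) -> float:
        # Without a trigger, the normal task reward passes through unchanged.
        if not triggered:
            return env_reward
        # With a trigger, the reward ties the trigger to the target action.
        return self.bonus if action == self.target_action else self.penalty


reward_fn = FixedBackdoorReward(target_action=2)
clean = reward_fn(env_reward=0.5, triggered=False, action=0)
hit = reward_fn(env_reward=0.5, triggered=True, action=2)
miss = reward_fn(env_reward=0.5, triggered=True, action=1)
```

Because `bonus` and `penalty` are fixed in advance, they may be badly scaled relative to a given task's own rewards, which is the lack of universality the abstract points to.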

Code & Models

Repositories

maoubo/unidoor (PyTorch, official)

Taxonomy

Topics: Cardiac electrophysiology and arrhythmias · Adversarial Robustness in Machine Learning