Angel or Demon: Investigating the Plasticity Interventions' Impact on Backdoor Threats in Deep Reinforcement Learning

Oubo Ma; Ruixiao Lin; Yang Dai; Jiahao Chen; Chunyi Zhou; Linkang Du; Shouling Ji

arXiv:2605.14587·cs.LG·May 15, 2026

TL;DR

This paper empirically investigates how plasticity interventions in deep reinforcement learning affect backdoor vulnerabilities. It finds that SAM is the only intervention that worsens backdoor threats, while the other interventions mitigate them, and it identifies loss landscape sharpness as an indicator for backdoor detection.

Contribution

It systematically studies 14,664 cases of interventions and attacks, introduces the SCC framework, and identifies loss landscape sharpness as a backdoor detection indicator.

Findings

01. Only the SAM intervention worsens backdoor threats.

02. Most interventions mitigate backdoor vulnerabilities.

03. Loss landscape sharpness indicates backdoor presence.

Abstract

Extensive research has highlighted the severe threats posed by backdoor attacks to deep reinforcement learning (DRL). However, prior studies primarily focus on vanilla scenarios, while plasticity interventions have emerged as indispensable built-in components of modern DRL agents. Despite their effectiveness in mitigating plasticity loss, the impact of these interventions on DRL backdoor vulnerabilities remains underexplored, and this lack of systematic investigation poses risks in practical DRL deployments. To bridge this gap, we empirically study 14,664 cases integrating representative interventions and attack scenarios. We find that only one intervention (i.e., SAM) exacerbates backdoor threats, while other interventions mitigate them. Pathological analysis identifies that the exacerbation is attributed to backdoor gradient amplification, while the mitigation stems from activation…
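The abstract names SAM and loss landscape sharpness but does not say how sharpness is measured. As a rough illustration only, the sketch below computes the first-order sharpness estimate used in the inner step of sharpness-aware minimization: the loss increase after moving the weights a distance `rho` along the normalized gradient. The function names, the toy quadratic losses, and the value of `rho` are assumptions made for this example; this is not the paper's detection procedure.

```python
import math
from typing import Callable, List

Vector = List[float]


def sam_sharpness(
    loss_fn: Callable[[Vector], float],
    grad_fn: Callable[[Vector], Vector],
    w: Vector,
    rho: float = 0.05,  # perturbation radius; an assumed value for this sketch
) -> float:
    """First-order sharpness estimate: L(w + eps) - L(w), eps = rho * g / ||g||."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # The perturbation direction is undefined at a stationary point;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    return loss_fn(w_adv) - loss_fn(w)


def make_quadratic(curvature: float):
    """Hypothetical toy loss L(w) = 0.5 * curvature * ||w||^2 and its gradient."""
    def loss(w: Vector) -> float:
        return 0.5 * curvature * sum(wi * wi for wi in w)

    def grad(w: Vector) -> Vector:
        return [curvature * wi for wi in w]

    return loss, grad


# Compare a flat and a sharp loss surface at the same point.
w = [1.0, -2.0]
flat_loss, flat_grad = make_quadratic(1.0)
sharp_loss, sharp_grad = make_quadratic(10.0)

flat_score = sam_sharpness(flat_loss, flat_grad, w)
sharp_score = sam_sharpness(sharp_loss, sharp_grad, w)
print(f"flat: {flat_score:.5f}, sharp: {sharp_score:.5f}")
```

On these toy losses the score grows with curvature, which is the sense in which a sharper region of the loss landscape yields a larger value. For a real DRL agent, `loss_fn` and `grad_fn` would have to be supplied by the training framework, and any threshold for flagging a backdoor would need to be calibrated empirically.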
