When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning

Arahan Kujur

arXiv:2605.16312·cs.LG·May 19, 2026

TL;DR

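This paper studies adversarial action masking in self-play reinforcement learning, in which an attacker removes legal actions from a victim's action set. It shows that learned masking degrades Q-learning, PPO, NFSP, neural NFSP, and DQN victims, and identifies action availability as a distinct robustness surface.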

Contribution

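It introduces adversarial action removal, shows that the attack is effective across multiple RL algorithms and domains (poker games and two non-poker domains), and explains its impact through reach-weighted contingent action capacity ($\mathrm{CAC}_{w}$) and a value-weighted refinement ($\mathrm{CAC}_{v}$).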

Findings

01. Learned masking causes more damage than random masking.

02. The attack persists across various RL algorithms and transfers between agents.

03. Action availability is a critical robustness surface in self-play RL.

Abstract

We study adversarial action masking in self-play reinforcement learning: an attacker selectively removes legal actions from a victim's action set. Unlike observation or action perturbations, removal eliminates decision options before the agent acts. Across poker games scaling from 6 to 5,531 information states and two non-poker domains, learned masking causes substantially more damage than random masking and learned perturbation baselines. The attack persists across Q-learning, PPO, NFSP, neural NFSP, and DQN victims; transfers across agents; is amplified by self-play; and shows no recovery under extended masked training. Mechanistically, the adversary targets high-value decision points, captured by reach-weighted contingent action capacity ($\mathrm{CAC}_{w}$) and a value-weighted refinement $\mathrm{CAC}_{v}$. These results identify action availability as a distinct robustness surface in self-play RL.
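To make the distinction between random and targeted removal concrete, the following is a minimal toy sketch, not the paper's method. It assumes a greedy victim with fixed action values at a single decision point, an attacker allowed to remove exactly one action while leaving at least one legal, and a greedy oracle (`targeted_mask`) standing in for the learned adversary. The action names, values, and all function names are hypothetical.

```python
import random

def victim_action(q_values, legal):
    """Greedy victim: pick the highest-value action still available."""
    return max(legal, key=lambda a: q_values[a])

def random_mask(q_values, legal, rng):
    """Baseline attacker: remove one legal action uniformly at random."""
    if len(legal) <= 1:
        return list(legal)  # assumption: never leave the victim with no action
    removed = rng.choice(legal)
    return [a for a in legal if a != removed]

def targeted_mask(q_values, legal):
    """Stand-in for a learned attacker: remove the victim's best action."""
    if len(legal) <= 1:
        return list(legal)  # assumption: never leave the victim with no action
    best = victim_action(q_values, legal)
    return [a for a in legal if a != best]

def value_loss(q_values, legal, masked):
    """Value the victim gives up because of the mask."""
    unmasked_value = q_values[victim_action(q_values, legal)]
    masked_value = q_values[victim_action(q_values, masked)]
    return unmasked_value - masked_value

# Hypothetical action values at one decision point (illustrative only).
q = {"fold": 0.0, "call": 0.4, "raise": 1.0}
legal = ["fold", "call", "raise"]
rng = random.Random(0)

targeted = value_loss(q, legal, targeted_mask(q, legal))
random_avg = sum(
    value_loss(q, legal, random_mask(q, legal, rng)) for _ in range(1000)
) / 1000

print(f"targeted loss: {targeted:.2f}, mean random loss: {random_avg:.2f}")
```

In this toy setting the targeted mask always removes the best action, while a random mask only hurts the victim when it happens to hit that action, so its average loss is lower. The sketch says nothing about self-play dynamics, transfer across agents, or how $\mathrm{CAC}_{w}$ and $\mathrm{CAC}_{v}$ are computed.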
