Curriculum-Guided Antifragile Reinforcement Learning for Secure UAV Deconfliction under Observation-Space Attacks
Deepak Kumar Panda, Adolfo Perrusquia, Weisi Guo

TL;DR
This paper introduces an antifragile reinforcement learning framework for UAV navigation that adapts to adversarial observation-space attacks, improving safety and resilience in dynamic, threat-rich environments.
Contribution
It proposes a novel antifragile RL approach with theoretical analysis and iterative critic alignment to enhance robustness against observation perturbations in UAV deconfliction.
Findings
The antifragile policy outperforms standard RL in attack scenarios.
It achieves up to 15% higher cumulative reward under attacks.
It reduces conflict events by over 30% during evaluations.
Abstract
Reinforcement learning (RL) policies deployed in safety-critical systems, such as unmanned aerial vehicle (UAV) navigation in dynamic airspace, are vulnerable to out-of-distribution (OOD) adversarial attacks in the observation space. These attacks induce distributional shifts that significantly degrade value estimation, leading to unsafe or suboptimal decision making and rendering the existing policy fragile. To address this vulnerability, we propose an antifragile RL framework designed to adapt against a curriculum of incremental adversarial perturbations. The framework introduces a simulated attacker that incrementally increases the strength of observation-space perturbations, enabling the RL agent to adapt and generalize across a wider range of OOD observations and anticipate previously unseen attacks. We begin with a theoretical characterization of fragility, formally defining…
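The curriculum idea in the abstract, an attacker whose observation-space perturbations grow stronger in stages, can be illustrated with a minimal Python sketch. This is not the paper's attacker or training loop: the linear schedule, the bounded uniform noise, and the names `epsilon_schedule`, `perturb_observation`, and `curriculum` are all assumptions introduced here for illustration only.

```python
import random
from typing import Iterator, List, Tuple


def epsilon_schedule(stage: int, num_stages: int, eps_max: float) -> float:
    """Perturbation budget for a stage; a linear ramp from 0 to eps_max (assumed)."""
    if num_stages < 2:
        return eps_max
    return eps_max * stage / (num_stages - 1)


def perturb_observation(obs: List[float], eps: float, rng: random.Random) -> List[float]:
    """Add independent uniform noise in [-eps, eps] to each observation component.

    A stand-in for the simulated attacker; the paper's attack model may differ.
    """
    return [x + rng.uniform(-eps, eps) for x in obs]


def curriculum(
    obs: List[float], num_stages: int, eps_max: float, seed: int = 0
) -> Iterator[Tuple[int, float, List[float]]]:
    """Yield (stage, eps, perturbed_obs) with the budget increasing each stage."""
    rng = random.Random(seed)
    for stage in range(num_stages):
        eps = epsilon_schedule(stage, num_stages, eps_max)
        yield stage, eps, perturb_observation(obs, eps, rng)


if __name__ == "__main__":
    clean = [0.5, -0.2, 1.0]  # placeholder observation, not data from the paper
    for stage, eps, noisy in curriculum(clean, num_stages=5, eps_max=0.4):
        print(stage, round(eps, 3), [round(v, 3) for v in noisy])
```

In a full training setup, each stage's perturbed observations would be fed to the agent before moving on to the next, stronger stage; how the agent is then updated is outside the scope of this sketch.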
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Smart Grid Security and Resilience
Methods: Greedy Policy Search
