Learned Controllers for Agile Quadrotors in Pursuit-Evasion Games

Alejandro Sanchez Roncero; Yixi Cai; Olov Andersson; Petter Ogren

arXiv:2506.02849 · cs.RO · September 16, 2025

TL;DR

This paper introduces an Asynchronous Multi-Stage Population-Based (AMSPB) reinforcement learning algorithm for training agile quadrotor controllers in 1v1 pursuit-evasion. The algorithm targets non-stationarity and catastrophic forgetting, and the trained policies outperform baseline methods in simulation.

Contribution

The paper proposes the AMSPB algorithm for stable multi-stage training of quadrotor controllers, improving pursuit-evasion performance and generalization in high-fidelity simulations.

Findings

01 AMSPB-trained policies outperform baseline methods.

02 Body-rate-and-thrust controllers enable more agile flight.

03 Policies generalize well across different arena sizes.

Abstract

We address the problem of agile 1v1 quadrotor pursuit-evasion, where a pursuer and an evader learn to outmaneuver each other through reinforcement learning (RL). Such settings face two major challenges: non-stationarity, since each agent's evolving policy alters the environment dynamics and destabilizes training, and catastrophic forgetting, where a policy overfits to the current adversary and loses effectiveness against previously encountered strategies. To tackle these issues, we propose an Asynchronous Multi-Stage Population-Based (AMSPB) algorithm. At each stage, the pursuer and evader are trained asynchronously against a frozen pool of opponents sampled from a growing population of past and current policies, stabilizing training and ensuring exposure to diverse behaviors. Within this framework, we train neural network controllers that output either velocity commands or body rates…
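
The stage structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `Policy`, `train_against`, `amspb_sketch`, `num_stages`, and `pool_size` are hypothetical names, the training step is a stub that only records a new snapshot, and alternating the two roles inside each stage is one assumed reading of "asynchronously". What the sketch does show is the frozen opponent pool being drawn from a population that grows by one snapshot per role per stage.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy snapshot; a real one would hold network weights."""
    role: str   # "pursuer" or "evader"
    stage: int  # stage at which this snapshot was produced (0 = initial)


def train_against(role, stage, frozen_pool):
    """Stub for an RL training run against a fixed set of opponents.

    Assumption: the opponents are not updated while this role trains.
    """
    assert frozen_pool, "need at least one frozen opponent"
    return Policy(role=role, stage=stage)


def amspb_sketch(num_stages=3, pool_size=2, seed=0):
    rng = random.Random(seed)
    # Each population starts with one initial policy and only ever grows.
    population = {
        "pursuer": [Policy("pursuer", 0)],
        "evader": [Policy("evader", 0)],
    }
    for stage in range(1, num_stages + 1):
        # Assumed schedule: train one role at a time while the other is frozen.
        for role, rival in (("pursuer", "evader"), ("evader", "pursuer")):
            candidates = population[rival]  # past and current rival policies
            k = min(pool_size, len(candidates))
            frozen_pool = tuple(rng.sample(candidates, k))
            population[role].append(train_against(role, stage, frozen_pool))
    return population


if __name__ == "__main__":
    for role, snapshots in amspb_sketch().items():
        print(role, [p.stage for p in snapshots])
```

Because earlier snapshots stay in the population, a policy trained at a later stage can still be sampled against old opponents, which is the mechanism the abstract credits with limiting catastrophic forgetting.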

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Guidance and Control Systems