Decentralized End-to-End Multi-AAV Pursuit Using Predictive Spatio-Temporal Observation via Deep Reinforcement Learning

Yude Li; Zhexuan Zhou; Huizhe Li; Yanke Sun; Yenan Wu; Yichen Lai; Yiming Wang; Youmin Gong; Jie Mei

arXiv:2603.24238 · cs.RO · March 26, 2026

Open Access

TL;DR

This paper introduces a decentralized reinforcement learning framework for multi-AAV pursuit that uses raw LiDAR data and a novel predictive spatio-temporal observation to improve navigation and interception in cluttered environments.

Contribution

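The paper presents an end-to-end multi-agent reinforcement learning (MARL) approach built on the Predictive Spatio-Temporal Observation (PSTO) representation, enabling scalable, robust multi-agent pursuit without relying on privileged ground-truth information.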

Findings

01. Achieves higher capture efficiency than existing methods.

02. Scales seamlessly across different team sizes.

03. Validated in outdoor quadrotor experiments using only onboard sensors.

Abstract

Decentralized cooperative pursuit in cluttered environments is challenging for autonomous aerial swarms, especially under partial and noisy perception. Existing methods often rely on abstracted geometric features or privileged ground-truth states, and therefore sidestep perceptual uncertainty in real-world settings. We propose a decentralized end-to-end multi-agent reinforcement learning (MARL) framework that maps raw LiDAR observations directly to continuous control commands. Central to the framework is the Predictive Spatio-Temporal Observation (PSTO), an egocentric grid representation that aligns obstacle geometry with predictive adversarial intent and teammate motion in a unified, fixed-resolution projection. Built on PSTO, a single decentralized policy enables agents to navigate static obstacles, intercept dynamic targets, and maintain cooperative encirclement. Simulations…
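To make the idea of an egocentric, fixed-resolution grid more concrete, here is a minimal sketch of how obstacle points, a predicted target path, and teammate positions might be rasterized into one multi-channel observation. This is not the authors' implementation. The abstract does not specify the channel layout, grid resolution, or prediction model, so the function name `build_egocentric_grid`, the three-channel layout, the 2D body-frame inputs, the constant-velocity prediction, and every default parameter are assumptions made for illustration only.

```python
import numpy as np

def build_egocentric_grid(obstacle_xy, target_xy, target_vel, teammate_xy,
                          grid_size=32, cell=0.5, horizon=1.0, steps=5):
    """Hypothetical sketch: rasterize body-frame 2D points (metres) into a
    3-channel grid centred on the agent.
    Channel 0: obstacles, channel 1: predicted target path, channel 2: teammates.
    """
    grid = np.zeros((3, grid_size, grid_size), dtype=np.float32)
    half = grid_size * cell / 2.0  # half-width of the covered area in metres

    def to_cells(xy):
        # Convert metric (x, y) to integer (col, row) and drop out-of-range points.
        xy = np.asarray(xy, dtype=np.float32).reshape(-1, 2)
        idx = np.floor((xy + half) / cell).astype(int)
        inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
        return idx[inside], inside

    # Channel 0: binary obstacle occupancy (e.g. from projected LiDAR returns).
    idx, _ = to_cells(obstacle_xy)
    grid[0, idx[:, 1], idx[:, 0]] = 1.0

    # Channel 1: assumed constant-velocity target prediction; later steps are fainter.
    ts = np.linspace(0.0, horizon, steps)
    pred = np.asarray(target_xy) + ts[:, None] * np.asarray(target_vel)
    weights = np.linspace(1.0, 0.2, steps, dtype=np.float32)
    idx, inside = to_cells(pred)
    np.maximum.at(grid[1], (idx[:, 1], idx[:, 0]), weights[inside])

    # Channel 2: current teammate positions.
    idx, _ = to_cells(teammate_xy)
    grid[2, idx[:, 1], idx[:, 0]] = 1.0
    return grid

obs = build_egocentric_grid(
    obstacle_xy=[[1.0, 0.0]], target_xy=[0.0, 0.0],
    target_vel=[2.0, 0.0], teammate_xy=[[-1.0, 2.0]],
)
print(obs.shape)
```

Under these assumptions the output shape depends only on `grid_size`, not on the number of obstacles or teammates, which is one plausible way a single policy could be reused across team sizes.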

Taxonomy

Topics: Distributed Control Multi-Agent Systems · Guidance and Control Systems · Reinforcement Learning in Robotics