Scaling Multi Agent Reinforcement Learning for Underwater Acoustic Tracking via Autonomous Vehicles

Matteo Gallici; Ivan Masmitja; Mario Martín

arXiv:2505.08222 · cs.RO · October 20, 2025

TL;DR

This paper presents a scalable framework combining iterative simulation distillation and a Transformer-based MARL architecture, enabling efficient training and deployment of autonomous underwater vehicle fleets for multi-target tracking with high accuracy.

Contribution

It introduces a novel iterative distillation method for high-fidelity simulation acceleration and a Transformer-based MARL model that is invariant to the number of agents and targets.
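
The summary does not spell out the architecture, so the following is only a minimal sketch of why attention over a set of entities can yield a policy input whose size does not depend on how many agents or targets are present. It is not the authors' model: the function `attention_pool`, the helper names, the dimensions `FEATURES` and `HIDDEN`, and the random weights are all assumptions made for illustration. Each entity (an agent or a target) is projected with shared weights, and a single query attends over however many entities there are.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vec) for row in matrix]


def attention_pool(query, entities, w_key, w_value):
    """Pool a variable-length list of entity vectors into one fixed-size vector.

    Hypothetical single-head attention: the same key/value weights are applied
    to every entity, so the entity count never appears in any weight shape.
    """
    keys = [matvec(w_key, e) for e in entities]
    values = [matvec(w_value, e) for e in entities]
    scale = math.sqrt(len(query))
    scores = [dot(query, k) / scale for k in keys]
    # Numerically stable softmax over the entity axis.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Weighted sum of values: output length equals the value dimension.
    return [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    ]


# Illustrative sizes and random weights (assumptions, not from the paper).
rng = random.Random(0)
FEATURES, HIDDEN = 3, 4


def random_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


w_key = random_matrix(HIDDEN, FEATURES)
w_value = random_matrix(HIDDEN, FEATURES)
query = [rng.uniform(-1, 1) for _ in range(HIDDEN)]

two_entities = random_matrix(2, FEATURES)
five_entities = random_matrix(5, FEATURES)

out_small = attention_pool(query, two_entities, w_key, w_value)
out_large = attention_pool(query, five_entities, w_key, w_value)
# Reordering the entities should leave the pooled vector unchanged
# (up to floating-point rounding).
out_shuffled = attention_pool(
    query, list(reversed(five_entities)), w_key, w_value
)
```

In this sketch, `out_small` and `out_large` have the same length even though they summarize two and five entities respectively, and `out_shuffled` matches `out_large` up to rounding. The same trained weights could therefore be reused when the fleet size or the number of targets changes, which is the property the contribution statement describes.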

Findings

01

Achieves up to 30,000x speedup in simulation training.

02

Maintains tracking errors below 5 meters in complex scenarios.

03

Enables scalable, high-precision underwater tracking with autonomous vehicles.

Abstract

Autonomous vehicles (AVs) offer a cost-effective solution for scientific missions such as underwater tracking. Recently, reinforcement learning (RL) has emerged as a powerful method for controlling AVs in complex marine environments. However, scaling these techniques to a fleet, which is essential for multi-target tracking or for targets with rapid, unpredictable motion, presents significant computational challenges. Multi-Agent Reinforcement Learning (MARL) is notoriously sample-inefficient, and while high-fidelity simulators like Gazebo's LRAUV provide 100x faster-than-real-time single-robot simulations, they offer no significant speedup for multi-vehicle scenarios, making MARL training impractical. To address these limitations, we propose an iterative distillation method that transfers high-fidelity simulations into a simplified, GPU-accelerated environment while preserving high-level dynamics.…
