Towards Bio-inspired Heuristically Accelerated Reinforcement Learning for Adaptive Underwater Multi-Agents Behaviour

Antoine Vivien; Thomas Chaffre; Matthew Stephenson; Eva Artusi; Paulo Santos; Benoit Clement; Karl Sammut

arXiv:2502.06113·cs.RO·February 11, 2025

TL;DR

This paper introduces a biologically inspired heuristic to accelerate multi-agent reinforcement learning for underwater coverage tasks, reducing training time and improving efficiency in complex, uncertain environments.

Contribution

It proposes a novel integration of heuristics based on Particle Swarm Optimization (PSO) into multi-agent reinforcement learning (MARL) to speed up convergence on underwater multi-agent coverage problems.
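As a rough illustration of how a PSO-style heuristic could bias a learning agent's action choice, the sketch below follows the general heuristically accelerated Q-learning pattern: pick the action that maximizes the learned value plus a weighted heuristic bonus. Everything in it is an assumption made for illustration and is not taken from the paper, whose actual formulation is not given on this page: the grid action set `ACTIONS`, the function names `pso_velocity`, `heuristic`, and `select_action`, and the parameters `w`, `c1`, `c2`, `xi`, and `epsilon` are all hypothetical.

```python
import math
import random

# Hypothetical discrete action set: unit moves on a 2-D grid (+y, -y, +x, -x).
ACTIONS = [(0.0, 1.0), (0.0, -1.0), (1.0, 0.0), (-1.0, 0.0)]


def pso_velocity(vel, pos, personal_best, global_best, rng, w=0.7, c1=1.5, c2=1.5):
    """Standard PSO velocity update for one agent treated as a particle."""
    r1, r2 = rng.random(), rng.random()
    return tuple(
        w * v + c1 * r1 * (pb - p) + c2 * r2 * (gb - p)
        for v, p, pb, gb in zip(vel, pos, personal_best, global_best)
    )


def heuristic(vel):
    """Score each action by its cosine alignment with the PSO velocity."""
    norm = math.hypot(*vel)
    if norm == 0.0:
        return [0.0] * len(ACTIONS)
    return [(ax * vel[0] + ay * vel[1]) / norm for ax, ay in ACTIONS]


def select_action(q_values, vel, rng, xi=1.0, epsilon=0.0):
    """Epsilon-greedy choice over Q(s, a) + xi * H(s, a)."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))  # explore uniformly
    h = heuristic(vel)
    scores = [q + xi * hv for q, hv in zip(q_values, h)]
    return max(range(len(ACTIONS)), key=scores.__getitem__)


# Usage: an untrained agent (all Q-values zero) at the origin, with both
# best-known positions located in the +y direction.
rng = random.Random(0)
vel = pso_velocity((0.0, 0.0), (0.0, 0.0), (0.0, 3.0), (0.0, 5.0), rng)
action = select_action([0.0, 0.0, 0.0, 0.0], vel, rng)
print(ACTIONS[action])  # the heuristic alone steers the agent towards +y
```

In this pattern the heuristic only reorders actions during selection and leaves the value update untouched, so a large enough learned Q-value gap can still override it. Early in training, when the Q-values carry little information, the swarm-derived direction dominates the choice.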

Findings

01 Accelerated convergence in MARL training using PSO heuristics.

02 Effective coverage planning in complex underwater environments.

03 Reduced number of interactions needed for optimal policy learning.

Abstract

This paper describes the problem of coordinating an autonomous Multi-Agent System (MAS) which aims to solve the coverage planning problem in a complex environment. The considered applications are the detection and identification of objects of interest while covering an area. These tasks, which are highly relevant for space applications, are also of interest in various other domains, including the underwater context, which is the focus of this study. In this context, coverage planning is traditionally modelled as a Markov Decision Process (MDP) where a coordinated MAS, a swarm of heterogeneous autonomous underwater vehicles, is required to survey an area and search for objects. This MDP is associated with several challenges: environment uncertainties, communication constraints, and an ensemble of hazards, including time-varying and unpredictable changes in the underwater environment. MARL algorithms…
