Critic-Free Deep Reinforcement Learning for Maritime Coverage Path Planning on Irregular Hexagonal Grids

Carlos S. Sepúlveda; Gonzalo A. Ruz

arXiv:2603.28385·cs.LG·March 31, 2026

TL;DR

This paper introduces a critic-free deep reinforcement learning method using Transformer-based policies for efficient maritime coverage path planning on irregular grids, outperforming heuristics in success rate and path efficiency.

Contribution

The paper presents a critic-free DRL framework with a Transformer-based pointer policy for maritime CPP. The framework handles irregular areas without per-instance re-planning and outperforms heuristic baselines in success rate and path efficiency.

Findings

01

Achieved 99.0% success rate in synthetic environments.

02

Paths are 7% shorter and have 24% fewer heading changes compared with heuristic baselines.

03

Inference runs under 50 ms per instance on a laptop GPU.
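
The heading-change figure above can be made concrete with a small sketch. It assumes a path is a list of hexagonal cells in axial coordinates `(q, r)` and that every move goes to one of the six adjacent cells; the coordinate convention and the names `HEX_DIRECTIONS` and `heading_changes` are illustrative assumptions, not taken from the paper, whose exact metric definition is not given on this page.

```python
# Six unit moves on a hexagonal grid in axial (q, r) coordinates.
# The axial convention is an assumption made for this illustration.
HEX_DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def heading_changes(path):
    """Count how often the move direction changes along a hex-grid path."""
    # Direction of each move, as the difference between consecutive cells.
    steps = [(q2 - q1, r2 - r1) for (q1, r1), (q2, r2) in zip(path, path[1:])]
    for step in steps:
        if step not in HEX_DIRECTIONS:
            raise ValueError(f"non-adjacent move: {step}")
    # A heading change occurs whenever two consecutive moves differ.
    return sum(1 for a, b in zip(steps, steps[1:]) if a != b)


# Hypothetical path: two moves east, then two turns.
path = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 2)]
changes = heading_changes(path)
```

Under this definition a straight sweep across the grid contributes no heading changes, so a lower count indicates a path with fewer turns.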

Abstract

Maritime surveillance missions, such as search and rescue and environmental monitoring, rely on the efficient allocation of sensing assets over vast and geometrically complex areas. Traditional Coverage Path Planning (CPP) approaches depend on decomposition techniques that struggle with irregular coastlines, islands, and exclusion zones, or require computationally expensive re-planning for every instance. We propose a Deep Reinforcement Learning (DRL) framework to solve CPP on hexagonal grid representations of irregular maritime areas. Unlike conventional methods, we formulate the problem as a neural combinatorial optimization task where a Transformer-based pointer policy autoregressively constructs coverage tours. To overcome the instability of value estimation in long-horizon routing problems, we implement a critic-free Group-Relative Policy Optimization (GRPO) scheme. This method…
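
The critic-free idea mentioned in the abstract can be illustrated with a minimal sketch of a group-relative advantage, as commonly used in GRPO-style training: several tours are sampled for the same instance, and each tour's reward is compared with the group's mean instead of a learned value estimate. The abstract is truncated here, so the standard-deviation normalization, the `eps` constant, and the example rewards below are assumptions rather than the authors' exact formulation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Advantages for one group of tours sampled on the same instance.

    The group mean acts as the baseline, so no critic network is needed.
    Dividing by the group standard deviation is a common choice and is
    an assumption here, not a detail confirmed by the abstract.
    """
    baseline = mean(rewards)
    scale = pstdev(rewards)
    return [(r - baseline) / (scale + eps) for r in rewards]


# Hypothetical rewards (for example, negative tour length) for four
# tours sampled from the policy on a single instance.
rewards = [-12.0, -10.0, -14.0, -12.0]
advantages = group_relative_advantages(rewards)
```

Tours that beat the group average receive positive advantages and are reinforced, while below-average tours are discouraged; a group with identical rewards yields zero advantages and therefore no gradient signal.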
