Continuous Control with Deep Reinforcement Learning for Autonomous Vessels
Nader Zare, Bruno Brandoli, Mahtab Sarvmaili, Amilcar Soares, and Stan Matwin

TL;DR
This paper introduces a state-action rotation strategy to enhance deep reinforcement learning for autonomous vessel navigation, significantly improving generalization and robustness in unseen maritime environments.
Contribution
The paper proposes a novel state-action rotation method combined with Deep Deterministic Policy Gradient to improve generalization in autonomous maritime navigation tasks.
Findings
Up to 11.96% improvement in rate of arrival to destination.
Up to 30.82% better performance in unseen environments.
Enhanced robustness and generalization in maritime scenarios.
Abstract
Maritime autonomous transportation has played a crucial role in the globalization of the world economy. Deep Reinforcement Learning (DRL) has been applied to automatic path planning to simulate vessel collision avoidance situations in open seas. End-to-end approaches that learn complex mappings directly from the input generalize poorly when the agent must reach targets in different environments. In this work, we present a new strategy called state-action rotation, which improves the agent's performance in unseen situations by rotating each obtained experience (state-action-state) and storing the rotated copies in the replay buffer. We designed our model based on Deep Deterministic Policy Gradient, a local view maker, and a planner. Our agent uses two deep Convolutional Neural Networks to estimate the policy and action-value functions. The proposed model was exhaustively trained and tested in maritime scenarios with…
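The idea of rotating an experience before storing it in the replay buffer can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that the state is a 2D offset from the vessel to the target and that the action is a 2D displacement, whereas the paper's agent works on a local view processed by convolutional networks. The names `rotate`, `rotated_transitions`, `ReplayBuffer`, and `add_with_rotations`, as well as the choice of four evenly spaced rotation angles, are hypothetical.

```python
import math
import random
from collections import deque


def rotate(vec, angle):
    """Rotate a 2D vector counter-clockwise by `angle` radians."""
    x, y = vec
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def rotated_transitions(state, action, reward, next_state, done, n_rotations=4):
    """Yield the original transition plus rotated copies of it.

    Assumption: state, action, and next_state are 2D vectors in the same
    frame, so one rotation applies consistently to all three. The reward is
    copied unchanged, which assumes it does not depend on absolute heading.
    """
    for k in range(n_rotations):
        angle = 2 * math.pi * k / n_rotations  # k = 0 keeps the original
        yield (
            rotate(state, angle),
            rotate(action, angle),
            reward,
            rotate(next_state, angle),
            done,
        )


class ReplayBuffer:
    """Fixed-capacity buffer that stores every transition with its rotations."""

    def __init__(self, capacity=10000):
        self.data = deque(maxlen=capacity)

    def add_with_rotations(self, *transition, n_rotations=4):
        for t in rotated_transitions(*transition, n_rotations=n_rotations):
            self.data.append(t)

    def sample(self, batch_size):
        return random.sample(list(self.data), min(batch_size, len(self.data)))


# One observed step: target 1 unit ahead, vessel moves 0.5 units toward it.
buffer = ReplayBuffer()
buffer.add_with_rotations((1.0, 0.0), (0.5, 0.0), -1.0, (0.5, 0.0), False)
```

In this sketch a single observed step yields four stored transitions, so an off-policy learner such as DDPG would sample the same maneuver under several headings. Whether the reward and environment dynamics are truly rotation-invariant depends on the scenario (for example, on obstacles or currents) and would need to be checked before applying this to a real simulator.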
Taxonomy
Topics: Maritime Navigation and Safety · Autonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms
