Reinforcement Learning for Pollution Detection in a Randomized, Sparse and Nonstationary Environment with an Autonomous Underwater Vehicle

Sebastian Zieglmeier; Niklas Erdmann; Narada D. Warakagoda

arXiv:2510.26347 · cs.LG · October 31, 2025

TL;DR

This paper modifies classical reinforcement learning algorithms so that an autonomous underwater vehicle can detect pollution in sparse, randomized, and nonstationary environments; its modified Monte Carlo approach outperforms Q-learning.

Contribution

It introduces novel modifications to classical RL algorithms, including hierarchical strategies, multi-goal learning, and location memory integration, tailored for challenging environmental conditions.

Findings

01 Modified Monte Carlo approach outperforms Q-learning

02 Hierarchical and multi-goal strategies improve efficiency

03 Location memory prevents state revisits
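
As a rough illustration of the third finding, the sketch below shows one way a location memory could act as an external output filter: the learned action values stay unchanged, and a set of visited cells only vetoes moves that would return to a known location. The grid layout, the four-action set, and the names `ACTIONS` and `filter_action` are assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]

# Hypothetical 4-connected action set on a 2D grid (an assumption, not the paper's).
ACTIONS: Dict[str, Cell] = {
    "up": (0, 1),
    "down": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
}


def filter_action(
    position: Cell, action_values: Dict[str, float], visited: Set[Cell]
) -> str:
    """Return the highest-valued action whose target cell is not in `visited`.

    The memory only filters the policy's output; it never changes the values.
    If every neighbouring cell was already visited, fall back to the
    unfiltered greedy action so the agent is never left without a move.
    """

    def target(action: str) -> Cell:
        dx, dy = ACTIONS[action]
        return (position[0] + dx, position[1] + dy)

    fresh = [a for a in action_values if target(a) not in visited]
    candidates = fresh if fresh else list(action_values)
    return max(candidates, key=lambda a: action_values[a])


# Usage: "right" has the highest value but leads back to a visited cell,
# so the filter selects the best remaining action instead.
visited = {(0, 0), (1, 0)}
values = {"up": 0.1, "down": -0.2, "left": 0.0, "right": 0.5}
choice = filter_action((0, 0), values, visited)
```

Keeping the memory outside the learner means the same filter could in principle sit behind either a Monte Carlo or a Q-learning agent; whether the authors structure it this way is not stated in the summary above.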

Abstract

Reinforcement learning (RL) algorithms are designed to optimize problem-solving by learning actions that maximize rewards, a task that becomes particularly challenging in random and nonstationary environments. Even advanced RL algorithms are often limited in their ability to solve problems in these conditions. In applications such as searching for underwater pollution clouds with autonomous underwater vehicles (AUVs), RL algorithms must navigate reward-sparse environments, where actions frequently result in a zero reward. This paper aims to address these challenges by revisiting and modifying classical RL approaches to efficiently operate in sparse, randomized, and nonstationary environments. We systematically study a large number of modifications, including hierarchical algorithm changes, multigoal learning, and the integration of a location memory as an external output filter to…
