Vision-driven River Following of UAV via Safe Reinforcement Learning using Semantic Dynamics Model
Zihan Wang, Nina Mahmoudian

TL;DR
This paper introduces a novel safe reinforcement learning framework for UAV river following that combines a semantic dynamics model with a new advantage estimation method, improving safety and efficiency in complex river environments.
Contribution
It proposes three key innovations: Marginal Gain Advantage Estimation (MGAE), a Semantic Dynamics Model (SDM), and the Constrained Actor Dynamics Estimator (CADE) architecture, advancing safe RL in vision-driven UAV navigation.
Findings
MGAE converges faster and outperforms traditional methods.
SDM provides accurate short-term predictions for safety.
CADE effectively balances safety and reward in simulation.
Abstract
Vision-driven autonomous river following by Unmanned Aerial Vehicles (UAVs) is critical for applications such as rescue, surveillance, and environmental monitoring, particularly in dense riverine environments where GPS signals are unreliable. These safety-critical navigation tasks must satisfy hard safety constraints while optimizing performance. Moreover, the reward in river following is inherently history-dependent (non-Markovian): it depends on which river segments have already been visited, which makes the task challenging for standard safe Reinforcement Learning (SafeRL). To address these gaps, we propose three contributions. First, we introduce Marginal Gain Advantage Estimation (MGAE), which refines the reward advantage function by using a sliding window baseline computed from historical episodic returns, aligning the advantage estimate with non-Markovian dynamics. Second, we develop a Semantic Dynamics Model (SDM) based on…
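As a rough illustration of the sliding window baseline idea behind Marginal Gain Advantage Estimation, the sketch below subtracts the mean of the most recent episodic returns from the current return. It is a minimal sketch under stated assumptions, not the paper's formulation: the class name `SlidingWindowBaseline`, the window size, the zero baseline for the first episode, and the sample returns are all hypothetical.

```python
from collections import deque


class SlidingWindowBaseline:
    """Hypothetical helper: baseline = mean of the last `window` episodic returns."""

    def __init__(self, window=3):
        # deque with maxlen drops the oldest return automatically
        self.returns = deque(maxlen=window)

    def value(self):
        # Assumption: with no history yet, the baseline is 0.0
        if not self.returns:
            return 0.0
        return sum(self.returns) / len(self.returns)

    def advantage(self, episodic_return):
        # Gain of this episode relative to recent history
        adv = episodic_return - self.value()
        # Update the window only after computing the advantage
        self.returns.append(episodic_return)
        return adv


baseline = SlidingWindowBaseline(window=3)
sample_returns = [10.0, 12.0, 8.0, 14.0, 14.0]  # made-up episodic returns
advantages = [baseline.advantage(r) for r in sample_returns]
print(advantages)
```

Because the baseline tracks recent returns and not a state-value estimate, the same return yields a smaller advantage once the policy has already been achieving it, which is one plausible way to read "marginal gain" in a history-dependent setting.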
