Deployable Vision-driven UAV River Navigation via Human-in-the-loop Preference Alignment

Zihan Wang; Jianwen Li; Li-Fan Wu; Nina Mahmoudian

arXiv:2511.01083 · cs.RO · November 4, 2025


Open Access

TL;DR

This paper presents SPAR-H, a human-in-the-loop learning method that efficiently adapts vision-driven UAV policies for river navigation by aligning with human preferences, improving safety and performance in real-world deployment.

Contribution

Introduction of SPAR-H, a novel hybrid preference alignment method that combines direct preference optimization with reward estimation for UAV river navigation.

Findings

01. SPAR-H achieves higher episodic rewards and lower variance than other methods.

02. The learned reward model aligns well with human preferences.

03. SPAR-H demonstrates real-world feasibility for UAV river following.

Abstract

Rivers are critical corridors for environmental monitoring and disaster response, where Unmanned Aerial Vehicles (UAVs) guided by vision-driven policies can provide fast, low-cost coverage. However, deployment exposes simulation-trained policies to distribution shift and safety risks, and requires efficient adaptation from limited human interventions. We study human-in-the-loop (HITL) learning with a conservative overseer who vetoes unsafe or inefficient actions and provides statewise preferences by comparing the agent's proposal with a corrective override. We introduce Statewise Hybrid Preference Alignment for Robotics (SPAR-H), which fuses direct preference optimization on policy logits with a reward-based pathway that trains an immediate-reward estimator from the same preferences and updates the policy using a trust-region surrogate. With five HITL rollouts collected from a fixed…
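The statewise preference signal described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a discrete action space and a Bradley-Terry (logistic) preference model, and the names `policy_preference_loss`, `reward_preference_loss`, and the temperature `beta` are hypothetical. It shows only how a single comparison (the overseer's override preferred over the agent's proposal) could be scored against policy logits and against a reward estimator. The trust-region policy update is not covered.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def policy_preference_loss(logits, preferred, rejected, beta=1.0):
    """Logistic preference loss on policy logits for one state.

    Assumption: a Bradley-Terry model over actions in the same state.
    The softmax normalizer cancels in the log-probability difference,
    so the margin can be computed from raw logits.
    """
    margin = beta * (logits[preferred] - logits[rejected])
    # -log(sigmoid(margin)) equals softplus(-margin)
    return softplus(-margin)


def reward_preference_loss(r_preferred, r_rejected):
    """Same logistic loss applied to immediate-reward estimates.

    Assumption: the estimator outputs one scalar per (state, action).
    """
    return softplus(-(r_preferred - r_rejected))


if __name__ == "__main__":
    # Hypothetical state with three discrete actions; the overseer
    # overrides the agent's proposal (action 1) with action 0.
    logits = [0.2, 1.5, -0.3]
    print(policy_preference_loss(logits, preferred=0, rejected=1))
    print(reward_preference_loss(r_preferred=0.4, r_rejected=0.9))
```

Both losses shrink as the preferred action's score rises above the rejected action's score, which is the behavior a statewise comparison is meant to encourage. How SPAR-H weights or combines the two pathways is not specified in the text above.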


Taxonomy

Topics: Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications