Deployable Vision-driven UAV River Navigation via Human-in-the-loop Preference Alignment
Zihan Wang, Jianwen Li, Li-Fan Wu, Nina Mahmoudian

TL;DR
This paper presents SPAR-H, a human-in-the-loop learning method that efficiently adapts vision-driven UAV policies for river navigation by aligning with human preferences, improving safety and performance in real-world deployment.
Contribution
Introduction of SPAR-H, a novel hybrid preference alignment method that combines direct preference optimization with reward estimation for UAV river navigation.
Findings
SPAR-H achieves higher episodic rewards and lower variance than the other methods evaluated.
The learned reward model aligns well with human preferences.
SPAR-H demonstrates real-world feasibility for UAV river following.
Abstract
Rivers are critical corridors for environmental monitoring and disaster response, where Unmanned Aerial Vehicles (UAVs) guided by vision-driven policies can provide fast, low-cost coverage. However, deployment exposes simulation-trained policies to distribution shift and safety risks and requires efficient adaptation from limited human interventions. We study human-in-the-loop (HITL) learning with a conservative overseer who vetoes unsafe or inefficient actions and provides statewise preferences by comparing the agent's proposal with a corrective override. We introduce Statewise Hybrid Preference Alignment for Robotics (SPAR-H), which fuses direct preference optimization on policy logits with a reward-based pathway that trains an immediate-reward estimator from the same preferences and updates the policy using a trust-region surrogate. With five HITL rollouts collected from a fixed…
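To make the idea of a statewise preference concrete, here is a minimal sketch in Python. It is not the paper's implementation: it assumes a Bradley-Terry style logistic loss, a discrete action space, and a hypothetical function name (statewise_preference_loss) and scale parameter (beta) that do not appear in the abstract. Each sample pairs the overseer's corrective action (preferred) with the agent's vetoed proposal (rejected) at the same state. The same loss could in principle be applied either to policy logits or to the outputs of an immediate-reward estimator. The trust-region policy update is not shown.

```python
import numpy as np


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)


def statewise_preference_loss(scores, preferred, rejected, beta=1.0):
    """Illustrative Bradley-Terry style loss on per-state action scores.

    scores:    (batch, num_actions) array, e.g. policy logits or
               estimated immediate rewards (an assumption of this sketch).
    preferred: (batch,) int array, the overseer's corrective action.
    rejected:  (batch,) int array, the agent's vetoed proposal.
    beta:      hypothetical scale on the score margin.
    """
    idx = np.arange(scores.shape[0])
    margin = beta * (scores[idx, preferred] - scores[idx, rejected])
    # Minimizing this pushes the preferred action's score above the rejected one.
    return float(np.mean(-log_sigmoid(margin)))


# Two states, three actions, uninformative scores: loss equals log(2).
uniform = np.zeros((2, 3))
preferred = np.array([0, 1])
rejected = np.array([1, 2])
print(statewise_preference_loss(uniform, preferred, rejected))

# Raising the preferred actions' scores lowers the loss.
aligned = uniform.copy()
aligned[np.arange(2), preferred] = 2.0
print(statewise_preference_loss(aligned, preferred, rejected))
```

With all scores equal the margin is zero, so the loss is log(2), about 0.693. Any set of scores that ranks the overseer's action above the vetoed proposal gives a smaller value.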
Taxonomy
Topics: Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications
