HALO: Human Preference Aligned Offline Reward Learning for Robot Navigation

Gershom Seneviratne; Jianyu An; Sahire Ellahy; Kasun Weerakoon; Mohamed Bashir Elnoor; Jonathan Deepak Kannan; Amogha Thalihalla Sunil; Dinesh Manocha

arXiv:2508.01539·cs.RO·August 5, 2025

TL;DR

HALO is a new offline reward learning method that captures human navigation preferences to improve robot navigation, demonstrating superior real-world performance and generalization across environments and hardware.

Contribution

HALO introduces a novel offline reward learning algorithm that aligns robot navigation with human preferences using preference ranking and binary feedback.

Findings

01

HALO outperforms state-of-the-art methods in success rate and trajectory metrics.

02

Policies trained with HALO generalize well to unseen environments.

03

HALO is effective in both learning-based and classical navigation frameworks.

Abstract

In this paper, we introduce HALO, a novel Offline Reward Learning algorithm that quantifies human intuition in navigation into a vision-based reward function for robot navigation. HALO learns a reward model from offline data, leveraging expert trajectories collected from mobile robots. During training, actions are uniformly sampled around a reference action and ranked using preference scores derived from a Boltzmann distribution centered on the preferred action, and shaped based on binary user feedback to intuitive navigation queries. The reward model is trained via the Plackett-Luce loss to align with these ranked preferences. To demonstrate the effectiveness of HALO, we deploy its reward model in two downstream applications: (i) an offline learned policy trained directly on the HALO-derived rewards, and (ii) a model-predictive-control (MPC) based planner that incorporates the HALO…
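The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the box-shaped sampling region, the squared-distance energy, the temperature value, and the choice to treat the reference action as the preferred action are all assumptions made for illustration, and the shaping by binary user feedback is omitted. The sketch samples actions uniformly around a reference action, scores them with a Boltzmann distribution, and evaluates the Plackett-Luce negative log-likelihood of the resulting ranking under a set of predicted rewards.

```python
import numpy as np

def sample_actions(ref_action, half_width, n, rng):
    """Uniformly sample n actions in a box around ref_action (assumed scheme)."""
    ref_action = np.asarray(ref_action, dtype=float)
    offsets = rng.uniform(-half_width, half_width, size=(n, ref_action.size))
    return ref_action + offsets

def boltzmann_scores(actions, preferred_action, temperature=1.0):
    """Normalized scores that decay with squared distance from the preferred action."""
    d2 = np.sum((actions - preferred_action) ** 2, axis=1)
    logits = -d2 / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def plackett_luce_nll(rewards, ranking):
    """Negative log-likelihood of `ranking` (best first) under Plackett-Luce,
    using the predicted rewards as log-worths."""
    s = np.asarray(rewards, dtype=float)[ranking]
    # Running log-sum-exp over each suffix s[i:], computed from the tail.
    suffix_lse = np.logaddexp.accumulate(s[::-1])[::-1]
    return float(np.sum(suffix_lse - s))

rng = np.random.default_rng(0)
ref = np.array([0.5, 0.0])                # illustrative (linear, angular) velocity
actions = sample_actions(ref, 0.3, 8, rng)
scores = boltzmann_scores(actions, ref, temperature=0.05)
ranking = np.argsort(-scores)             # most preferred action first
rewards = rng.normal(size=8)              # stand-in for reward model outputs
loss = plackett_luce_nll(rewards, ranking)
```

In a training loop, `rewards` would come from the vision-based reward model and `loss` would be minimized with respect to its parameters, so that the model's ordering of candidate actions moves toward the preference ranking.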

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Social Robot Interaction and HRI