NaviMaster: Learning a Unified Policy for GUI and Embodied Navigation Tasks

Zhihao Luo; Wentao Yan; Jingyu Gong; Min Wang; Zhizhong Zhang; Xuhong Wang; Yuan Xie; Xin Tan

arXiv:2508.02046·cs.RO·May 5, 2026

TL;DR

NaviMaster introduces a unified reinforcement learning framework that combines GUI and embodied navigation tasks, leveraging a shared data collection and reward strategy to improve generalization and performance across benchmarks.

Contribution

NaviMaster is presented as the first agent to unify GUI navigation and embodied navigation within a single framework, using a common MDP formulation and a shared training pipeline.

Findings

01 Outperforms state-of-the-art methods on GUI navigation and embodied navigation tasks.

02 Effective data mixing and reward design improve learning efficiency.

03 The unified training strategy enhances generalization across benchmarks.

Abstract

Graphical User Interface (GUI) navigation and embodied navigation have each seen recent advances, yet the two domains have largely evolved in isolation, with disparate datasets and training paradigms. In this paper, we observe that both tasks can be formulated as Markov Decision Processes (MDPs), suggesting a foundational principle for their unification. Hence, we present NaviMaster, the first unified agent that handles GUI navigation and embodied navigation within a single framework. Specifically, NaviMaster (i) proposes a visual-target trajectory collection pipeline that generates trajectories for both GUI and embodied tasks using a single formulation; (ii) employs a unified reinforcement learning framework on the mixed data to improve generalization; and (iii) designs a novel distance-aware reward to ensure efficient learning from the trajectories. Through extensive experiments on…
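
The abstract names a distance-aware reward but does not give its formula here. As a rough illustration of the general idea only, the sketch below scores a predicted target point by how close it lands to the ground-truth point, so near misses earn more than far misses instead of a flat zero. The `Point` type, the `distance_aware_reward` name, the exponential decay, the `scale` parameter, and the assumption that coordinates are normalized to [0, 1] are all hypothetical choices made for this example, not the authors' definition.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A 2D point in normalized image coordinates (assumed range [0, 1])."""
    x: float
    y: float


def distance_aware_reward(pred: Point, target: Point, scale: float = 0.1) -> float:
    """Hypothetical dense reward in (0, 1].

    Returns 1.0 when the prediction coincides with the target and decays
    exponentially with Euclidean distance. The decay shape and `scale`
    are illustrative assumptions, not taken from the paper.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    distance = math.hypot(pred.x - target.x, pred.y - target.y)
    return math.exp(-distance / scale)


if __name__ == "__main__":
    target = Point(0.5, 0.5)
    near = Point(0.52, 0.5)   # small miss
    far = Point(0.9, 0.1)     # large miss
    print(distance_aware_reward(near, target) > distance_aware_reward(far, target))
```

Under these assumptions, the same function could score a click location on a screenshot or a projected waypoint in an embodied scene, since both reduce to a point in the image. A binary hit-or-miss reward would give no signal for a near miss, which is the kind of sparsity a distance-based shaping term is meant to soften.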

Code & Models

Repositories

https://iron-boyy.github.io/navimaster-page
