PILOT: A Perceptive Integrated Low-level Controller for Loco-manipulation over Unstructured Scenes

Xinru Cui; Linxi Feng; Yixuan Zhou; Haoqi Han; Zhe Liu; and Hesheng Wang

arXiv:2601.17440 · cs.RO · January 27, 2026

TL;DR

PILOT introduces a unified reinforcement learning framework that integrates perceptive locomotion and manipulation for humanoid robots, improving stability and terrain handling in unstructured environments.

Contribution

It presents a novel single-stage RL approach with a cross-modal encoder and Mixture-of-Experts architecture for enhanced loco-manipulation in complex scenes.
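
For readers unfamiliar with the term, the general Mixture-of-Experts idea can be illustrated with a minimal sketch. It shows plain softmax gating over a few linear "experts" and is not the paper's architecture: the names `moe_forward`, `gate_weights`, and `expert_weights`, the linear experts, and the dense (non-sparse) mixing are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def moe_forward(x, gate_weights, expert_weights):
    """Blend expert outputs with input-dependent gate probabilities.

    Hypothetical layout: gate_weights has one row per expert, and
    expert_weights holds one weight matrix per expert.
    """
    gate = softmax(linear(gate_weights, x))           # one probability per expert
    outputs = [linear(w, x) for w in expert_weights]  # each expert's output
    dim = len(outputs[0])
    mixed = [sum(g * out[i] for g, out in zip(gate, outputs)) for i in range(dim)]
    return mixed, gate


# Two toy experts: identity and 3x scaling. A zero gate gives equal weights.
x = [1.0, 0.0]
gate_weights = [[0.0, 0.0], [0.0, 0.0]]
expert_weights = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 0.0], [0.0, 3.0]],
]
mixed, gate = moe_forward(x, gate_weights, expert_weights)
```

With a zero gate the two experts are weighted equally, so the output is the average of their outputs. A trained gate would instead shift weight toward whichever expert suits the current input.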

Findings

01. Demonstrates superior stability and terrain traversability in simulation and real-world tests.
02. Achieves higher command tracking precision compared to baselines.
03. Validates effectiveness on the Unitree G1 humanoid robot.

Abstract

Humanoid robots hold great potential for diverse interactions and daily service tasks within human-centered environments, necessitating controllers that seamlessly integrate precise locomotion with dexterous manipulation. However, most existing whole-body controllers lack exteroceptive awareness of the surrounding environment, rendering them insufficient for stable task execution in complex, unstructured scenarios. To address this challenge, we propose PILOT, a unified single-stage reinforcement learning (RL) framework tailored for perceptive loco-manipulation, which synergizes perceptive locomotion and expansive whole-body control within a single policy. To enhance terrain awareness and ensure precise foot placement, we design a cross-modal context encoder that fuses prediction-based proprioceptive features with attention-based perceptive representations. Furthermore, we introduce a…
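
The abstract does not say how the cross-modal context encoder combines its two inputs. As a rough illustration of what attention-based fusion can look like in general, the sketch below uses a proprioceptive feature vector as the query in scaled dot-product attention over a set of terrain tokens, then concatenates the attended context with the original feature. Everything here is an assumption for illustration: the names `attend` and `fuse`, the use of the tokens as both keys and values with no learned projections, and fusion by concatenation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, tokens):
    """Scaled dot-product attention with a single query.

    The tokens serve as both keys and values (a simplification;
    real encoders usually apply learned projections first).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * t for q, t in zip(query, tok)) / scale for tok in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    context = [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(dim)]
    return context, weights


def fuse(proprio_feature, terrain_tokens):
    """Concatenate the proprioceptive feature with its attended terrain context."""
    context, weights = attend(proprio_feature, terrain_tokens)
    return proprio_feature + context, weights


# Hypothetical inputs: one 2-D proprioceptive feature, two 2-D terrain tokens.
proprio = [1.0, 0.0]
terrain = [[1.0, 0.0], [0.0, 1.0]]
fused, attn = fuse(proprio, terrain)
```

In this toy case the first terrain token aligns with the query and so receives the larger attention weight. The fused vector has four entries: the original feature followed by the attended context.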

Taxonomy

Topics: Robotic Locomotion and Control · Human Pose and Action Recognition · Muscle activation and electromyography studies