LHM-Humanoid: Learning a Unified Policy for Long-Horizon Humanoid Whole-Body Loco-Manipulation in Diverse Messy Environments

Haozhuo Zhang; Jingkai Sun; Michele Caprio; Jian Tang; Shanghang Zhang; Qiang Zhang; Wei Pan

arXiv:2508.16943 · cs.RO · March 6, 2026

TL;DR

LHM-Humanoid introduces a benchmark and learning framework enabling humanoid robots to perform complex, long-horizon loco-manipulation tasks in diverse, cluttered environments with a single unified policy.

Contribution

The paper presents a new benchmark, a dataset, and a unified policy-learning approach for humanoid loco-manipulation in varied scenes, with an emphasis on cross-scene generalization and end-to-end control.

Findings

01 Outperforms prior methods in unseen scenes

02 Demonstrates strong long-horizon robustness

03 Effective in diverse, cluttered environments

Abstract

We introduce LHM-Humanoid, a benchmark and learning framework for long-horizon whole-body humanoid loco-manipulation in diverse, cluttered scenes. In our setting, multiple objects are displaced from their intended locations and may obstruct navigation; a humanoid agent must repeatedly (i) walk to a target, (ii) pick it up with diverse whole-body postures under balance constraints, (iii) carry it while navigating around obstacles, and (iv) place it at a designated goal -- all within a single continuous episode and without any environment reset. This task simultaneously demands cross-scene generalization and unified one-policy control: layouts, obstacle arrangements, object category/mass/shape/color and object start/goal poses vary substantially even within a room category, requiring a single general policy that directly outputs actions rather than invoking pre-trained skill libraries.…
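The task structure described in the abstract, four sub-stages repeated per object within one continuous episode, can be sketched as a simple schedule. This is only an illustration of the episode layout, not the authors' method: the names `Stage`, `STAGE_ORDER`, and `episode_schedule` are hypothetical, and the abstract states that the policy outputs actions directly rather than switching between pre-trained skills, so nothing like this explicit stage list is implied to exist inside the controller.

```python
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical labels for the four sub-stages named in the abstract."""
    WALK = auto()   # (i) walk to the target object
    PICK = auto()   # (ii) pick it up under balance constraints
    CARRY = auto()  # (iii) carry it while navigating around obstacles
    PLACE = auto()  # (iv) place it at the designated goal


# Assumed fixed ordering of sub-stages for each displaced object.
STAGE_ORDER = [Stage.WALK, Stage.PICK, Stage.CARRY, Stage.PLACE]


def episode_schedule(num_objects):
    """List the (object_index, stage) pairs of one continuous episode.

    All objects are handled back to back with no environment reset,
    so the schedule is a single flat sequence.
    """
    return [(i, stage) for i in range(num_objects) for stage in STAGE_ORDER]


if __name__ == "__main__":
    for obj, stage in episode_schedule(2):
        print(obj, stage.name)
```

Under these assumptions, an episode with N displaced objects contains 4N sub-stages in sequence, which is what makes the horizon long: an error in any early stage carries over into every later one because the scene is never reset.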
