TextOp: Real-time Interactive Text-Driven Humanoid Robot Motion Generation and Control
Weiji Xie, Jiakun Zheng, Jinrui Han, Jiyuan Shi, Weinan Zhang, Chenjia Bai, Xuelong Li

TL;DR
TextOp introduces a real-time, text-driven humanoid robot control framework that enables flexible, interactive, and smooth execution of diverse motions through a two-level architecture combining motion generation and tracking.
Contribution
TextOp presents a novel two-level architecture for real-time, text-driven humanoid motion control that supports interactive command modification and diverse behaviors.
Findings
Supports streaming language commands and on-the-fly modifications.
Enables smooth transitions across multiple behaviors like dancing and jumping.
Demonstrates instant responsiveness and precise control on real robots.
Abstract
Recent advances in humanoid whole-body motion tracking have enabled the execution of diverse and highly coordinated motions on real hardware. However, existing controllers are commonly driven either by predefined motion trajectories, which offer limited flexibility when user intent changes, or by continuous human teleoperation, which requires constant human involvement and limits autonomy. This work addresses the problem of how to drive a universal humanoid controller in a real-time and interactive manner. We present TextOp, a real-time text-driven humanoid motion generation and control framework that supports streaming language commands and on-the-fly instruction modification during execution. TextOp adopts a two-level architecture in which a high-level autoregressive motion diffusion model continuously generates short-horizon kinematic trajectories conditioned on the current text…
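The two-level loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `StubGenerator`, `StubTracker`, `run`, the horizon length, the tracker gain, and the policy of discarding the buffered plan when a new command arrives are all hypothetical placeholders chosen for illustration. The sketch only shows the data flow: a high-level generator emits short-horizon reference frames conditioned on the current text and past frames, and a low-level tracker follows them one frame at a time.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Tuple

Frame = List[float]  # one kinematic pose; a single float here as a placeholder


@dataclass
class StubGenerator:
    """Hypothetical stand-in for the high-level autoregressive generator."""
    horizon: int = 4  # frames produced per call (assumed value)

    def generate(self, text: str, history: List[Frame]) -> List[Frame]:
        # A real model would sample a trajectory; this stub simply continues
        # from the last frame with a text-dependent increment.
        last = history[-1][0] if history else 0.0
        step = (len(text) % 5 + 1) * 0.01
        return [[last + step * (i + 1)] for i in range(self.horizon)]


@dataclass
class StubTracker:
    """Hypothetical stand-in for the low-level tracking controller."""
    state: Frame = field(default_factory=lambda: [0.0])
    gain: float = 0.5  # assumed proportional gain

    def step(self, target: Frame) -> Frame:
        # Move a fixed fraction of the way toward the reference frame.
        self.state = [s + self.gain * (t - s) for s, t in zip(self.state, target)]
        return self.state


def run(commands: Dict[int, str], total_steps: int,
        gen: StubGenerator, trk: StubTracker) -> List[Tuple[int, str, float]]:
    """Run the loop; `commands` maps a step index to a newly arrived text command."""
    text = ""
    buffer: Deque[Frame] = deque()
    history: List[Frame] = []
    log: List[Tuple[int, str, float]] = []
    for t in range(total_steps):
        if t in commands:
            text = commands[t]
            buffer.clear()  # assumption: drop the stale plan on a new command
        if not buffer:
            # Condition on past frames so the new segment continues smoothly.
            buffer.extend(gen.generate(text, history))
        target = buffer.popleft()
        history.append(target)
        trk.step(target)
        log.append((t, text, target[0]))
    return log


if __name__ == "__main__":
    for row in run({0: "walk", 5: "jump"}, 10, StubGenerator(), StubTracker()):
        print(row)
```

In this sketch, a command arriving mid-segment takes effect on the same step because the buffered frames are discarded and regenerated from the execution history. Whether the real system replans in this way is not stated in the text above.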
Taxonomy
Topics: Robotic Locomotion and Control · Human Motion and Animation · Robot Manipulation and Learning
