TL;DR
CLAW is a scalable pipeline that generates physically feasible, language-annotated whole-body motion data for humanoid robots, combining motion primitives, real-time interfaces, and natural language annotations.
Contribution
CLAW introduces a system for scalable, physically grounded motion-language data generation, built from composable motion primitives and natural language templates.
Findings
The pipeline generates diverse, physically feasible motion datasets for humanoid robots.
It provides real-time interfaces for exploratory data collection.
The system is publicly available to the research community.
Abstract
Training language-conditioned whole-body controllers for humanoid robots demands large-scale motion-language datasets. Existing approaches based on motion capture are costly and limited in diversity, while text-to-motion generative models produce purely kinematic outputs that are not guaranteed to be physically feasible. We present CLAW, a pipeline for scalable generation of language-annotated whole-body motion data for the Unitree G1 humanoid robot. CLAW composes motion primitives from a kinematic planner, parameterized by movement, heading, speed, pelvis height, and duration, and provides two browser-based interfaces (a real-time keyboard mode and a timeline-based sequence editor) for exploratory and batch data collection. A low-level controller tracks these references in MuJoCo simulation, yielding physically grounded trajectories. In parallel, a template-based engine generates…
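To make the parameterization concrete, here is a minimal Python sketch of how a motion primitive and a template-based description might be represented. It is not CLAW's implementation: the class `MotionPrimitive`, the functions `describe` and `annotate`, the field names, units, sign convention for heading, and the sentence template are all hypothetical. Only the five parameters (movement, heading, speed, pelvis height, duration) come from the abstract, and because the abstract's description of the template engine is truncated above, the output wording is purely illustrative.

```python
from dataclasses import dataclass
from typing import Iterable


# Hypothetical record for one primitive. The five fields mirror the parameters
# named in the abstract; the types, units, and vocabulary are assumptions.
@dataclass(frozen=True)
class MotionPrimitive:
    movement: str           # e.g. "walk" (assumed vocabulary)
    heading_deg: float      # assumed convention: positive = turn left
    speed_mps: float        # assumed unit: metres per second
    pelvis_height_m: float  # assumed unit: metres
    duration_s: float       # assumed unit: seconds


def describe(p: MotionPrimitive) -> str:
    """Fill a fixed sentence template from one primitive's parameters."""
    if abs(p.heading_deg) < 1e-6:
        direction = "straight ahead"
    elif p.heading_deg > 0:
        direction = f"{abs(p.heading_deg):g} degrees to the left"
    else:
        direction = f"{abs(p.heading_deg):g} degrees to the right"
    return (
        f"{p.movement} {direction} at {p.speed_mps:g} m/s "
        f"with the pelvis at {p.pelvis_height_m:g} m for {p.duration_s:g} s"
    )


def annotate(sequence: Iterable[MotionPrimitive]) -> str:
    """Join per-primitive descriptions into one annotation for a sequence."""
    return ", then ".join(describe(p) for p in sequence)


if __name__ == "__main__":
    seq = [
        MotionPrimitive("walk", 0.0, 0.5, 0.75, 2.0),
        MotionPrimitive("walk", 90.0, 0.5, 0.75, 1.5),
    ]
    print(annotate(seq))
```

In this sketch the annotation is derived from the same parameters that define the motion reference, which is one plausible reason a template approach scales: each composed sequence comes with a matching description at no extra labeling cost.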
