Words into Action: Learning Diverse Humanoid Robot Behaviors using Language Guided Iterative Motion Refinement
K. Niranjan Kumar, Irfan Essa, Sehoon Ha

TL;DR
This paper presents a method for training humanoid robot behaviors using natural language commands, combining language models and motion refinement to simplify controller development and enable diverse motion learning.
Contribution
It introduces a novel iterative refinement approach that leverages large language models and motion retargeting to efficiently learn diverse humanoid behaviors from natural language instructions.
Findings
Successfully learned diverse motions like walking, hopping, and kicking.
Achieved 3x faster learning speed compared to naive methods.
Validated on a simulated Digit humanoid robot.
Abstract
Humanoid robots are well suited for human habitats due to their morphological similarity, but developing controllers for them is a challenging task that involves multiple sub-problems, such as control, planning and perception. In this paper, we introduce a method to simplify controller design by enabling users to train and fine-tune robot control policies using natural language commands. We first learn a neural network policy that generates behaviors given a natural language command, such as "walk forward", by combining Large Language Models (LLMs), motion retargeting, and motion imitation. Based on the synthesized motion, we iteratively fine-tune by updating the text prompt and querying LLMs to find the best checkpoint associated with the closest motion in history. We validate our approach using a simulated Digit humanoid robot and demonstrate learning of diverse motions, such as…
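The iterative fine-tuning step in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `HistoryEntry`, `closest_checkpoint`, `refine`, and `train_fn` are hypothetical, and the word-overlap similarity is a simple stand-in for the LLM query the paper uses to pick the closest motion in history.

```python
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    prompt: str      # text prompt that produced the motion
    checkpoint: str  # hypothetical identifier of the saved policy checkpoint


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (stand-in for an LLM query)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def closest_checkpoint(history, new_prompt, similarity=token_overlap):
    """Return the history entry whose prompt best matches new_prompt, or None."""
    if not history:
        return None
    return max(history, key=lambda entry: similarity(entry.prompt, new_prompt))


def refine(history, new_prompt, train_fn):
    """One refinement step: warm-start from the closest checkpoint, then record it."""
    start = closest_checkpoint(history, new_prompt)
    init = start.checkpoint if start else None
    # train_fn is an assumed callable: (prompt, initial checkpoint) -> new checkpoint
    new_checkpoint = train_fn(new_prompt, init)
    history.append(HistoryEntry(new_prompt, new_checkpoint))
    return new_checkpoint
```

Under these assumptions, each updated prompt warm-starts training from the most similar earlier behavior instead of starting from scratch.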
Taxonomy
Topics: Human Pose and Action Recognition · Robotic Locomotion and Control · Human Motion and Animation
