Language Movement Primitives: Grounding Language Models in Robot Motion
Yinlong Dai, Benjamin A. Christie, Daniel J. Evans, Dylan P. Losey, and Simon Stepputtis

TL;DR
This paper introduces Language Movement Primitives (LMPs), a framework that connects vision-language models with robot motion control using Dynamic Movement Primitives, enabling zero-shot manipulation with 80% success across 20 tasks.
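For readers unfamiliar with Dynamic Movement Primitives, the following is a minimal sketch of a standard one-dimensional discrete DMP rollout. It is not the paper's implementation. The function name `rollout_dmp`, the gain values, and the basis-width heuristic are illustrative assumptions, and how a vision-language model would choose the goal and weights is not specified in this summary.

```python
import numpy as np

def rollout_dmp(y0, goal, weights, tau=1.0, dt=0.001, alpha_z=25.0, alpha_x=4.0):
    """Integrate a textbook 1-D discrete DMP with explicit Euler steps.

    All names and default gains are illustrative assumptions,
    not values taken from the paper.
    """
    weights = np.asarray(weights, dtype=float)
    beta_z = alpha_z / 4.0  # critically damped spring-damper
    n = len(weights)
    # Basis centers spaced along the decaying phase variable x.
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n))
    # Width heuristic (an assumption; other choices are common).
    widths = n ** 1.5 / centers / alpha_x

    y, z, x = float(y0), 0.0, 1.0
    trajectory = [y]
    for _ in range(int(round(tau / dt))):
        psi = np.exp(-widths * (x - centers) ** 2)
        # Forcing term: weighted basis functions, gated by the phase x
        # so that it vanishes as the motion completes.
        forcing = (psi @ weights) / (psi.sum() + 1e-10) * x * (goal - y0)
        dz = (alpha_z * (beta_z * (goal - y) - z) + forcing) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        trajectory.append(y)
    return np.array(trajectory)

# With zero weights the DMP reduces to a smooth reach toward the goal.
path = rollout_dmp(y0=0.0, goal=1.0, weights=np.zeros(10))
```

The small parameter set (a start, a goal, and a weight vector) is what makes such primitives interpretable: changing the goal moves the endpoint, while the weights shape the path taken to reach it.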
Contribution
The paper presents LMPs, a novel approach that grounds language model reasoning in interpretable motion primitives for zero-shot robotic manipulation.
Findings
LMP achieves 80% success across 20 tasks.
LMP outperforms the baseline, which reaches 31% success.
LMP performs zero-shot manipulation effectively without in-domain fine-tuning.
Abstract
Enabling robots to perform novel manipulation tasks from natural language instructions remains a fundamental challenge in robotics, despite significant progress in generalized problem solving with foundational models. Large vision and language models (VLMs) are capable of processing high-dimensional input data for visual scene and language understanding, as well as decomposing tasks into a sequence of logical steps; however, they struggle to ground those steps in embodied robot motion. On the other hand, robotics foundation models output action commands, but require in-domain fine-tuning or experience before they are able to perform novel tasks successfully. At its core, there still remains the fundamental challenge of connecting abstract task reasoning with low-level motion control. To address this disconnect, we propose Language Movement Primitives (LMPs), a framework that grounds VLM…
Taxonomy
Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Social Robot Interaction and HRI
