Embodied Learning of Reward for Musculoskeletal Control with Vision Language Models

Saraswati Soedarmadji; Yunyue Wei; Chen Zhang; Yisong Yue; Yanan Sui

arXiv:2512.23077 · cs.RO · January 27, 2026

TL;DR

This paper presents MoVLR, a framework that uses vision-language models to automatically discover and refine reward functions for complex musculoskeletal control tasks, bridging high-level goals and low-level control strategies.

Contribution

Introduces MoVLR, a novel method leveraging vision-language models to iteratively learn reward functions from natural language and visual feedback for high-dimensional motor control.

Findings

01. Effective reward discovery for musculoskeletal control
02. Aligns control policies with natural language goals
03. Enables structured guidance for embodied learning

Abstract

Discovering effective reward functions remains a fundamental challenge in motor control of high-dimensional musculoskeletal systems. While humans can describe movement goals explicitly, such as "walking forward with an upright posture," the underlying control strategies that realize these goals are largely implicit, making it difficult to design rewards directly from high-level goals and natural language descriptions. We introduce Motion from Vision-Language Representation (MoVLR), a framework that leverages vision-language models (VLMs) to bridge the gap between goal specification and movement control. Rather than relying on handcrafted rewards, MoVLR explores the reward space through iterative interaction between control optimization and VLM feedback, aligning control policies with physically coordinated behaviors. Our approach transforms language and vision-based…
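
The alternation the abstract describes, between control optimization and VLM feedback, can be illustrated with a toy loop. This is only a sketch under loud assumptions, not the authors' implementation: the reward is assumed to be a weighted sum of named terms, and `optimize_policy_stub`, `vlm_feedback_stub`, `refine_weights`, the term names, and the target values are all hypothetical stand-ins. A real system would train a musculoskeletal controller and query an actual VLM on rendered rollouts.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, float]


def optimize_policy_stub(weights: Weights) -> Dict[str, float]:
    """Hypothetical stand-in for control optimization.

    Returns summary features of the resulting behavior. In this toy,
    the behavior expresses each reward term in proportion to its weight.
    """
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def vlm_feedback_stub(goal: str, features: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical stand-in for VLM feedback on a rendered rollout.

    Returns a signed adjustment per reward term. The targets are made-up
    numbers standing in for a VLM's judgment of the goal description;
    `goal` is unused here because no real model is queried.
    """
    targets = {"forward_velocity": 0.5, "upright_posture": 0.5}
    return {name: targets[name] - features[name] for name in targets}


def refine_weights(weights: Weights, feedback: Dict[str, float],
                   step: float = 1.0) -> Weights:
    """Nudge each weight along the feedback, keeping it positive."""
    return {name: max(1e-3, w + step * feedback.get(name, 0.0))
            for name, w in weights.items()}


def reward_refinement_loop(goal: str, weights: Weights,
                           iterations: int = 5) -> Tuple[Weights, List[Weights]]:
    """Alternate policy optimization and feedback-driven reward refinement."""
    history: List[Weights] = []
    for _ in range(iterations):
        features = optimize_policy_stub(weights)      # control optimization
        feedback = vlm_feedback_stub(goal, features)  # VLM critique
        weights = refine_weights(weights, feedback)   # reward update
        history.append(dict(weights))
    return weights, history


if __name__ == "__main__":
    start = {"forward_velocity": 1.0, "upright_posture": 0.1}
    final, _ = reward_refinement_loop(
        "walking forward with an upright posture", start)
    print(final)
```

In this toy, a reward that initially favors forward velocity is rebalanced toward posture as the simulated feedback flags the missing behavior; the point is the shape of the loop, not the update rule.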

Taxonomy

Topics: Robot Manipulation and Learning · Motor Control and Adaptation · Action Observation and Synchronization