MorFiC: Fixing Value Miscalibration for Zero-Shot Quadruped Transfer
Prakhar Mishra, Amir Hossain Raj, Xuesu Xiao, and Dinesh Manocha

TL;DR
MorFiC introduces a morphology-aware critic conditioning method in reinforcement learning to enable zero-shot transfer of quadruped locomotion policies across different robot morphologies, improving stability and speed.
Contribution
This paper presents MorFiC, a novel approach that conditions the critic on robot morphology, enabling zero-shot transfer of locomotion policies across diverse quadruped robots.
Findings
Outperforms morphology-conditioned PPO baselines in speed and stability.
Reduces value-prediction error variance across morphologies.
Successfully deploys on real robots without fine-tuning.
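The value-prediction finding can be made concrete with a toy calculation. The sketch below uses invented returns for two hypothetical robots in the same state (the numbers and the names `light_robot` and `heavy_robot` are illustrative assumptions, not results from the paper). It shows that a critic which cannot distinguish morphologies is pushed toward the pooled mean, leaving a systematic error of opposite sign on each robot.

```python
import numpy as np

# Hypothetical returns observed from the same state on two robots
# (illustrative numbers only, not taken from the paper).
returns = {
    "light_robot": np.array([10.0, 11.0, 9.0]),
    "heavy_robot": np.array([2.0, 3.0, 1.0]),
}

# A critic that cannot see morphology must output one value for this state;
# the squared-error-optimal choice is the mean of all pooled targets.
shared_value = float(np.mean(np.concatenate(list(returns.values()))))

# A critic that can tell the robots apart can fit each robot's mean separately.
per_robot_value = {name: float(r.mean()) for name, r in returns.items()}


def mean_error(value, targets):
    """Average signed error (target minus prediction), i.e. the mean advantage bias."""
    return float(np.mean(targets - value))


shared_bias = {n: mean_error(shared_value, r) for n, r in returns.items()}
conditioned_bias = {n: mean_error(per_robot_value[n], r) for n, r in returns.items()}

print("shared critic value:", shared_value)
print("shared critic bias:", shared_bias)
print("conditioned critic bias:", conditioned_bias)
```

In this toy setting the shared critic's error is equal and opposite on the two robots, so advantages computed from it are inflated on one embodiment and deflated on the other, while the per-robot estimates have zero mean error.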
Abstract
Generalizing learned locomotion policies across quadrupedal robots with different morphologies remains a challenge. Policies trained on a single robot often break when deployed on embodiments with different mass distributions, kinematics, joint limits, or actuation constraints, forcing per-robot retraining. We present MorFiC, a reinforcement learning approach for zero-shot cross-morphology locomotion using a single shared policy. MorFiC resolves a key failure mode in multi-morphology actor-critic training: a shared critic tends to average incompatible value targets across embodiments, yielding miscalibrated advantages. To address this, MorFiC conditions the critic via morphology-aware modulation driven by robot physical and control parameters, generating morphology-specific value estimates within a shared network. Trained with a single source robot with morphology randomization in…
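One way to picture morphology-aware modulation is a feature-wise scale and shift applied to the critic's hidden layer. The abstract does not spell out the exact mechanism, so the following is a minimal sketch under that assumption, not the authors' implementation. The layer sizes, the parameter names (`W_gamma`, `W_beta`, and so on), and the three-element morphology vectors are all hypothetical.

```python
import numpy as np


def init_params(obs_dim, morph_dim, hidden, seed=0):
    """Random weights for a one-hidden-layer critic (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "W_obs": rng.normal(0.0, 0.5, (obs_dim, hidden)),      # shared trunk
        "b_obs": np.zeros(hidden),
        "W_gamma": rng.normal(0.0, 0.5, (morph_dim, hidden)),  # morphology -> scale
        "W_beta": rng.normal(0.0, 0.5, (morph_dim, hidden)),   # morphology -> shift
        "w_out": rng.normal(0.0, 0.5, hidden),                 # value head
        "b_out": 0.0,
    }


def modulated_value(obs, morph, p):
    """Value estimate whose hidden features are scaled and shifted by morphology."""
    h = np.tanh(obs @ p["W_obs"] + p["b_obs"])  # features shared by all robots
    gamma = 1.0 + morph @ p["W_gamma"]          # per-feature scale, identity at morph = 0
    beta = morph @ p["W_beta"]                  # per-feature shift
    return float((gamma * h + beta) @ p["w_out"] + p["b_out"])


params = init_params(obs_dim=4, morph_dim=3, hidden=8)
obs = np.array([0.1, -0.2, 0.3, 0.0])

# Hypothetical normalized descriptors, e.g. mass, leg length, motor gain.
morph_a = np.array([1.0, 0.5, 0.2])
morph_b = np.array([2.0, 0.8, 0.4])

v_a = modulated_value(obs, morph_a, params)
v_b = modulated_value(obs, morph_b, params)
print("same observation, robot A:", v_a)
print("same observation, robot B:", v_b)
```

The trunk and value head are shared across robots; only the scale and shift depend on the morphology vector, so the same observation can map to different value estimates for different embodiments without training a separate critic per robot.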
Taxonomy
Topics: Robotic Locomotion and Control · Reinforcement Learning in Robotics · Robot Manipulation and Learning
