Knowledge Diversion for Efficient Morphology Control and Policy Transfer
Fu Feng, Ruixiao Shi, Yucheng Xie, Jianlu Shen, Jing Wang, Xin Geng

TL;DR
DivMorph introduces a modular Transformer-based framework that disentangles shared and task-specific knowledge, enabling scalable, efficient morphology control and transfer across heterogeneous agents, with a reported 3x gain in sample efficiency for cross-task transfer and a 17x reduction in model size for deployment.
Contribution
The paper proposes knowledge diversion, a method that factorizes Transformer weights into factor units and uses dynamic gating over those units for scalable, transferable morphology control.
Findings
3x improvement in sample efficiency for cross-task transfer
17x reduction in model size for deployment
State-of-the-art performance in morphology control
Abstract
Universal morphology control aims to learn a universal policy that generalizes across heterogeneous agent morphologies, with Transformer-based controllers emerging as a popular choice. However, such architectures incur substantial computational costs, resulting in high deployment overhead, and existing methods exhibit limited cross-task generalization, necessitating training from scratch for each new task. To this end, we propose DivMorph, a modular training paradigm that leverages knowledge diversion to learn decomposable controllers. DivMorph factorizes randomly initialized Transformer weights into factor units via SVD prior to training and employs dynamic soft gating to modulate these units based on task and morphology embeddings, separating them into shared learngenes and morphology- and task-specific tailors, thereby achieving knowledge disentanglement.…
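The factorize-then-gate idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the sigmoid gate, the linear gate projection `gate_proj`, and the choice to pin the first `num_shared` units open as stand-ins for shared learngenes are all assumptions made for illustration. It only shows how a weight matrix can be split into rank-1 factor units by SVD and recombined under gates computed from task and morphology embeddings.

```python
import numpy as np

def factorize(weight):
    """Split a weight matrix into rank-1 factor units via thin SVD."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return u, s, vt

def soft_gates(task_emb, morph_emb, gate_proj, num_shared):
    """Sigmoid gates from the concatenated embeddings (hypothetical gate form).

    The first `num_shared` units are pinned to 1.0 to mimic always-on shared
    units; this split is an assumption, not something the abstract specifies.
    """
    context = np.concatenate([task_emb, morph_emb])
    logits = gate_proj @ context
    gates = 1.0 / (1.0 + np.exp(-logits))
    gates[:num_shared] = 1.0
    return gates

def compose(u, s, vt, gates):
    """Rebuild a weight matrix as the gate-weighted sum of factor units."""
    return (u * (s * gates)) @ vt

rng = np.random.default_rng(0)
d_out, d_in, emb_dim = 8, 6, 4

# Randomly initialized weight, factorized before any training takes place.
weight = rng.standard_normal((d_out, d_in))
u, s, vt = factorize(weight)
num_units = s.shape[0]

# Hypothetical gate projection and embeddings, random here for illustration.
gate_proj = rng.standard_normal((num_units, 2 * emb_dim))
task_emb = rng.standard_normal(emb_dim)
morph_emb = rng.standard_normal(emb_dim)

gates = soft_gates(task_emb, morph_emb, gate_proj, num_shared=2)
w_eff = compose(u, s, vt, gates)
```

With every gate set to 1 the composition reproduces the original matrix, so the gates act purely as a per-unit modulation of the factorized weights. Under this reading, a compact model for one task and morphology would keep only the units whose gates are open for it.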
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis
