SkillVLA: Tackling Combinatorial Diversity in Dual-Arm Manipulation via Skill Reuse

Xuanran Zhai; Zekai Huang; Longyan Wu; Qianyou Zhao; Qiaojun Yu; Jieji Ren; Ce Hao; Harold Soh

arXiv:2603.03836·cs.RO·March 5, 2026

Open Access

TL;DR

SkillVLA is a framework for dual-arm manipulation built around skill reuse: recombining previously learned single-arm skills across novel left-right pairings to handle complex tasks. The reported success rate rises from 0% to 51%, outperforming existing models.

Contribution

The paper presents SkillVLA, a novel framework that explicitly supports skill reuse in dual-arm manipulation, addressing the challenge of combinatorial diversity in vision-language-action models.

Findings

01. Success rate increased from 0% to 51%.

02. Improved skill composition for complex tasks.

03. Strong performance on cooperative and long-horizon tasks.

Abstract

Recent progress in vision-language-action (VLA) models has demonstrated strong potential for dual-arm manipulation, enabling complex behaviors and generalization to unseen environments. However, mainstream bimanual VLA formulations largely overlook the critical challenge of combinatorial diversity. Different pairings of single-arm behaviors can induce qualitatively distinct task behaviors, yet existing models do not explicitly account for this structure. We argue that effective bimanual VLAs should support skill reuse (the ability to recombine previously learned single-arm skills across novel left-right pairings), thereby avoiding the need to separately learn every possible combination. Current VLA designs entangle skills across arms, preventing such recomposition and limiting scalability. To address this limitation, we propose SkillVLA, a framework explicitly designed to enable skill…
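
The combinatorial argument in the abstract can be illustrated with a small counting sketch. This is not SkillVLA's method or code; the skill names below are hypothetical, and the assumption that each single-arm skill needs to be learned only once is an idealization of skill reuse. It only shows why the number of left-right pairings outgrows the number of individual skills.

```python
from itertools import product

# Hypothetical single-arm skills, chosen only for illustration
# (they do not come from the paper).
left_skills = ["grasp", "hold", "push"]
right_skills = ["pour", "wipe", "insert", "cut"]

# Every left-right pairing is a potentially distinct bimanual behavior.
pairings = list(product(left_skills, right_skills))

# Without reuse, each pairing would have to be covered separately (N * M).
num_pairings = len(pairings)

# With idealized reuse, each single-arm skill is learned once (N + M).
num_skills = len(left_skills) + len(right_skills)

print(num_pairings, num_skills)  # prints "12 7"
```

With 3 left-arm and 4 right-arm skills there are already 12 pairings against 7 individual skills, and the gap widens as either skill set grows.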

Taxonomy

Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Domain Adaptation and Few-Shot Learning