Language Conditioned Multi-Finger Dexterous Manipulation Enabled by Physical Compliance and Switching of Controllers
Cheng Pan, Kai Junge, Benhui Dai, Qinghua Guan, Josie Hughes

TL;DR
This paper introduces a novel switching controller that combines high-level vision-language reasoning with low-level dexterous manipulation on a compliant robotic hand, enhancing robustness and adaptability.
Contribution
The paper presents a method that integrates high-level Vision-Language-Action (VLA) models with lightweight dexterous policies via event-driven switching, enabling scalable, cross-embodiment dexterity.
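The event-driven switching idea can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SwitchingController`, the two stand-in policies, and the `enter_dex` / `exit_dex` event predicates (here triggered by hypothetical `contact` and `done` signals) are all assumptions made for illustration. The summary does not specify which events drive the switch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Observation = Dict[str, float]
Action = List[float]
Policy = Callable[[Observation], Action]


@dataclass
class SwitchingController:
    """Routes each observation to one of two policies (hypothetical sketch)."""

    high_level: Policy  # stand-in for a large VLA model
    low_level: Policy   # stand-in for a small dexterous policy
    enter_dex: Callable[[Observation], bool]  # assumed event: hand over to low level
    exit_dex: Callable[[Observation], bool]   # assumed event: hand back to high level
    mode: str = "high"
    history: List[str] = field(default_factory=list)

    def step(self, obs: Observation) -> Action:
        # Evaluate the switching events first, then query the active policy.
        if self.mode == "high" and self.enter_dex(obs):
            self.mode = "low"
        elif self.mode == "low" and self.exit_dex(obs):
            self.mode = "high"
        self.history.append(self.mode)
        policy = self.high_level if self.mode == "high" else self.low_level
        return policy(obs)


def demo() -> List[str]:
    """Runs four steps on a toy observation stream and returns the mode trace."""
    ctrl = SwitchingController(
        high_level=lambda obs: [0.0],  # placeholder arm-level action
        low_level=lambda obs: [1.0],   # placeholder finger-level action
        enter_dex=lambda obs: obs["contact"] > 0.5,
        exit_dex=lambda obs: obs["done"] > 0.5,
    )
    stream = [
        {"contact": 0.0, "done": 0.0},
        {"contact": 1.0, "done": 0.0},
        {"contact": 1.0, "done": 0.0},
        {"contact": 1.0, "done": 1.0},
    ]
    for obs in stream:
        ctrl.step(obs)
    return ctrl.history


if __name__ == "__main__":
    print(demo())
```

Because the two policies are only coupled through the event predicates, either one can in principle be swapped without touching the other, which is the kind of modularity the findings below describe.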
Findings
Hardware compliance improves contact stability and disturbance adaptation.
The approach enables language-conditioned dexterous tasks with minimal retraining.
Modularity allows adaptation to new skills and hardware without retraining the VLA.
Abstract
Human dexterity arises from combining high-level task reasoning with finger-level dexterity control and physical compliance at the muscle and skin layers. In robotics, large Vision-Language-Action (VLA) models demonstrate text-conditioned high-level planning across diverse manipulation tasks, typically using pincher grippers. Smaller imitation-learning policies, conversely, show success in dexterous tasks using higher degree-of-freedom (DoF) grippers, but only for limited-scope tasks. However, few approaches combine high-level reasoning with dexterous, robust low-level control, which requires both intelligent control and compliant robot design. We propose a method inspired by the two-channel hypothesis of human motor control that combines these capabilities using a switching controller integrating high-level VLAs and smaller control models. Coordination between the two channels is…
