Language Conditioned Multi-Finger Dexterous Manipulation Enabled by Physical Compliance and Switching of Controllers

Cheng Pan; Kai Junge; Benhui Dai; Qinghua Guan; Josie Hughes

arXiv:2410.14022 · cs.RO · May 12, 2026 · 2 cites

TL;DR

This paper introduces a switching controller that combines high-level vision-language reasoning with low-level dexterous control on a compliant robotic hand, improving robustness to disturbances and adaptability to new skills and hardware.

Contribution

The paper presents a method that integrates high-level Vision-Language-Action (VLA) models with lightweight dexterous policies through event-driven switching, with the aim of scalable, cross-embodiment dexterity.

Findings

01. Hardware compliance improves contact stability and disturbance adaptation.

02. The approach enables language-conditioned dexterous tasks with minimal retraining.

03. Modularity allows adaptation to new skills and hardware without retraining the VLA.

Abstract

Human dexterity arises from combining high-level task reasoning with finger-level dexterity control and physical compliance at the muscle and skin layers. In robotics, large Vision-Language-Action (VLA) models demonstrate text-conditioned high-level planning across diverse manipulation tasks, typically using pincher grippers. Smaller imitation-learning policies, conversely, show success in dexterous tasks using higher degree-of-freedom (DoF) grippers, but only for limited-scope tasks. However, few approaches combine high-level reasoning with dexterous, robust low-level control, which requires both intelligent control and compliant robot design. We propose a method inspired by the two-channel hypothesis of human motor control that combines these capabilities using a switching controller integrating high-level VLAs and smaller control models. Coordination between the two channels is…
