GR-Dexter Technical Report

Ruoshi Wen; Guangzeng Chen; Zhongren Cui; Min Du; Yang Gou; Zhigang Han; Liqun Huang; Mingyu Lei; Yunfei Li; Zhuohang Li; Wenlei Liu; Yuxiao Liu; Xiao Ma; Hao Niu; Yutao Ouyang; Zeyu Ren; Haixin Shi; Wei Xu; Haoxiang Zhang; Jiajun Zhang; Xiao Zhang; Liwei Zheng; Weiheng Zhong; Yifei Zhou; Zhengming Zhu; Hang Li

arXiv:2512.24210 · cs.RO · January 12, 2026

Open Access

TL;DR

GR-Dexter is a hardware-model-data framework for vision-language-conditioned manipulation on a bimanual dexterous-hand robot. It targets the main obstacles to scaling VLA policies to high-DoF hands: an expanded action space, frequent hand-object occlusions, and the cost of collecting real-robot data.

Contribution

It presents a compact 21-DoF robotic hand, a bimanual teleoperation system for real-robot data collection, and a training recipe that combines teleoperated robot trajectories with large-scale vision-language and curated cross-embodiment datasets.

Findings

01. Achieves strong in-domain performance on real-world tasks.

02. Demonstrates robustness to unseen objects and instructions.

03. Covers both long-horizon everyday manipulation and generalizable pick-and-place tasks.

Abstract

Vision-language-action (VLA) models have enabled language-conditioned, long-horizon robot manipulation, but most existing systems are limited to grippers. Scaling VLA policies to bimanual robots with high degree-of-freedom (DoF) dexterous hands remains challenging due to the expanded action space, frequent hand-object occlusions, and the cost of collecting real-robot data. We present GR-Dexter, a holistic hardware-model-data framework for VLA-based generalist manipulation on a bimanual dexterous-hand robot. Our approach combines the design of a compact 21-DoF robotic hand, an intuitive bimanual teleoperation system for real-robot data collection, and a training recipe that leverages teleoperated robot trajectories together with large-scale vision-language and carefully curated cross-embodiment datasets. Across real-world evaluations spanning long-horizon everyday manipulation and…

Taxonomy

Topics: Robot Manipulation and Learning · Multimodal Machine Learning Applications · Hand Gesture Recognition Systems