CUBic: Coordinated Unified Bimanual Perception and Control Framework
Xingyu Wang, Pengxiang Ding, Jingkai Xu, Donglin Wang, Zhaoxin Fan

TL;DR
CUBic introduces a unified framework for bimanual robot control that learns shared perceptual representations, enabling intrinsic coordination without hand-crafted coupling, and demonstrates superior performance on RoboTwin tasks.
Contribution
CUBic reformulates bimanual coordination as a unified perceptual modeling problem using shared tokenized representations, advancing beyond existing decoupled or strongly coupled methods.
Findings
CUBic outperforms standard baselines on the RoboTwin benchmark.
Achieves higher coordination accuracy in bimanual tasks.
Improves task success rates over state-of-the-art visuomotor methods.
Abstract
Recent advances in visuomotor policy learning have enabled robots to perform control directly from visual inputs. Yet, extending such end-to-end learning from single-arm to bimanual manipulation remains challenging due to the need for both independent perception and coordinated interaction between arms. Existing methods typically favor one side -- either decoupling the two arms to avoid interference or enforcing strong cross-arm coupling for coordination -- thus lacking a unified treatment. We propose CUBic, a Coordinated and Unified framework for Bimanual perception and control that reformulates bimanual coordination as a unified perceptual modeling problem. CUBic learns a shared tokenized representation bridging perception and control, where independence and coordination emerge intrinsically from structure rather than from hand-crafted coupling. Our approach integrates three…
