TL;DR
This paper presents a novel bimanual robot framework for performing Chinese stir-fry, using a decoupled learning approach and a Structured-Transformer model to coordinate dual-arm movements with visual feedback.
Contribution
It introduces a decoupled learning framework, in which the two arms take leader and follower roles, and a Graph- and Transformer-based model (Structured-Transformer) for bimanual manipulation of deformable objects, demonstrated on stir-fry.
Findings
Successfully implemented on a real Panda robot
Achieved coordinated bimanual stir-fry motion
Framework adaptable to other deformable objects
Abstract
This letter describes an approach to achieve the well-known Chinese cooking art of stir-fry on a bimanual robot system. Stir-fry requires a sequence of highly dynamic coordinated movements, which is usually difficult for a chef to learn, let alone to transfer to robots. In this letter, we define a canonical stir-fry movement, and then propose a decoupled framework for learning this deformable object manipulation from human demonstration. First, the dual arms of the robot are decoupled into different roles (a leader and a follower) and learned with classical and neural network-based methods separately, so that the bimanual task is transformed into a coordination problem. Second, to obtain general bimanual coordination, we propose a Graph- and Transformer-based model, Structured-Transformer, to capture the spatio-temporal relationship between dual-arm movements. Finally, by adding visual feedback of…
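The leader/follower decoupling described in the abstract can be illustrated with a toy sketch. This is not the paper's Structured-Transformer: it has no graph structure, no learned weights, and no visual feedback. It only shows the general idea of scripting a leader motion and deriving a follower target by attending over the leader's recent history. All names (`leader_trajectory`, `follower_command`, `attend`), the sine-wave motion, and the fixed `offset` are assumptions made for illustration.

```python
import math

def leader_trajectory(t, period=1.0, amplitude=0.05):
    """Hypothetical periodic leader motion (one scalar coordinate, in metres)."""
    return amplitude * math.sin(2.0 * math.pi * t / period)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    query and each key are lists of floats; values are scalars.
    Returns the weighted sum of values and the attention weights.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return sum(w * v for w, v in zip(weights, values)), weights

def follower_command(leader_history, offset=0.10):
    """Follower target derived from the leader's recent positions.

    The hand-built features and the fixed offset stand in for parameters
    that a trained model would learn; they are placeholders only.
    """
    feats = []
    for i, p in enumerate(leader_history):
        prev = leader_history[i - 1] if i > 0 else p
        feats.append([p, p - prev])  # (position, finite-difference step)
    query = feats[-1]  # the most recent leader state queries the history
    context, weights = attend(query, feats, leader_history)
    return context + offset, weights

dt = 0.05
history = [leader_trajectory(k * dt) for k in range(10)]
target, weights = follower_command(history)
print(round(target, 4), round(sum(weights), 6))
```

Because the attention output is a convex combination of past leader positions, the follower target always stays within the leader's range of motion shifted by `offset`, which makes the toy coordination rule easy to sanity-check.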
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Byte Pair Encoding · Absolute Position Encodings · Residual Connection · Dense Connections · Label Smoothing · Dropout
