Solving Rubik's Cube with a Robot Hand

OpenAI; Ilge Akkaya; Marcin Andrychowicz; Maciek Chociej; Mateusz Litwin; Bob McGrew; Arthur Petron; Alex Paino; Matthias Plappert; Glenn Powell; Raphael Ribas; Jonas Schneider; Nikolas Tezak; Jerry Tworek; Peter Welinder; Lilian Weng; Qiming Yuan; Wojciech Zaremba; Lei Zhang

arXiv:1910.07113 · cs.LG · October 17, 2019 · 632 cites

TL;DR

This paper presents a novel simulation training method called automatic domain randomization (ADR) that enables a humanoid robot hand to solve a Rubik's Cube in the real world, demonstrating advanced sim2real transfer and emergent meta-learning.

Contribution

The paper introduces ADR, an algorithm that automatically randomizes simulated training environments, and pairs it with a custom robot platform to transfer complex manipulation skills from simulation to a real robot.

Findings

01. Control policies trained with ADR transfer effectively to real robots.
02. Memory-augmented models show signs of meta-learning at test time.
03. The approach enables solving complex tasks like Rubik's Cube with a humanoid robot hand.

Abstract

We demonstrate that models trained only in simulation can be used to solve a manipulation problem of unprecedented complexity on a real robot. This is made possible by two key components: a novel algorithm, which we call automatic domain randomization (ADR), and a robot platform built for machine learning. ADR automatically generates a distribution over randomized environments of ever-increasing difficulty. Control policies and vision state estimators trained with ADR exhibit vastly improved sim2real transfer. For control policies, memory-augmented models trained on an ADR-generated distribution of environments show clear signs of emergent meta-learning at test time. The combination of ADR with our custom robot platform allows us to solve a Rubik's cube with a humanoid robot hand, which involves both control and state estimation problems. Videos summarizing our results are available:…
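
The abstract's description of ADR as generating "a distribution over randomized environments of ever-increasing difficulty" can be illustrated with a small sketch. This is not the paper's implementation. It assumes one plausible scheme: each simulator parameter is sampled uniformly from a range, the policy is occasionally probed at a range boundary, and the boundary moves outward when the probed performance is high and inward when it is low. The names `ParamRange` and `SimpleADR`, the thresholds, the buffer size, and the toy scoring rule are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ParamRange:
    """Uniform sampling range [low, high] for one simulator parameter."""
    low: float
    high: float
    step: float  # hypothetical: how far a boundary moves per update
    low_scores: list = field(default_factory=list)   # scores probed at `low`
    high_scores: list = field(default_factory=list)  # scores probed at `high`


class SimpleADR:
    """Toy curriculum over randomization ranges (assumed scheme, see lead-in)."""

    def __init__(self, ranges, buffer_size=4, expand_at=0.8, shrink_at=0.3, seed=0):
        self.ranges = ranges            # dict: parameter name -> ParamRange
        self.buffer_size = buffer_size  # scores collected before a boundary moves
        self.expand_at = expand_at      # mean score at or above this widens the range
        self.shrink_at = shrink_at      # mean score below this narrows the range
        self.rng = random.Random(seed)

    def sample_env(self):
        """Sample every parameter uniformly, then pin one to a boundary."""
        env = {n: self.rng.uniform(r.low, r.high) for n, r in self.ranges.items()}
        name = self.rng.choice(list(self.ranges))
        side = self.rng.choice(["low", "high"])
        env[name] = getattr(self.ranges[name], side)
        return env, (name, side)

    def report(self, probe, score):
        """Record a score for a probed boundary and move it once the buffer fills."""
        name, side = probe
        r = self.ranges[name]
        buf = r.low_scores if side == "low" else r.high_scores
        buf.append(score)
        if len(buf) < self.buffer_size:
            return
        mean = sum(buf) / len(buf)
        buf.clear()
        if mean >= self.expand_at:
            delta = r.step    # policy copes at the boundary: widen
        elif mean < self.shrink_at:
            delta = -r.step   # policy fails at the boundary: narrow
        else:
            return
        if side == "low":
            r.low = min(r.low - delta, r.high)
        else:
            r.high = max(r.high + delta, r.low)


if __name__ == "__main__":
    # Start with no randomization at all: friction is fixed at 1.0.
    adr = SimpleADR({"friction": ParamRange(1.0, 1.0, step=0.1)})
    for _ in range(40):
        env, probe = adr.sample_env()
        # Stand-in for a real rollout: a toy "policy" that copes with
        # friction values roughly between 0.7 and 1.3.
        score = 1.0 if 0.7 <= env["friction"] <= 1.3 else 0.0
        adr.report(probe, score)
    print(adr.ranges["friction"].low, adr.ranges["friction"].high)
```

In this sketch the range starts as a single point and grows only while the probed scores stay high, so the environment distribution widens as the policy improves. A real system would replace the toy scoring rule with actual training rollouts.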

Code & Models

Videos

OpenAI - Solving Rubik's Cube with a Robot Hand | RL paper explained · YouTube

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning

Methods: Test