CubeTR: Learning to Solve The Rubiks Cube Using Transformers
Mustafa Ebrahim Chasmai

TL;DR
CubeTR leverages transformer models to learn to solve the Rubik's Cube from arbitrary states, effectively handling sparse rewards and demonstrating potential for generalization to higher-dimensional puzzles.
Contribution
This work introduces CubeTR, a transformer-based reinforcement learning approach that solves the Rubik's Cube without human prior knowledge, addressing sparse reward challenges.
Findings
CubeTR successfully solves the Rubik's Cube from arbitrary states.
After move regularisation, solution lengths are expected to be close to those of the algorithms used by expert human solvers.
The approach demonstrates potential for generalizing to higher-dimensional puzzles.
Abstract
Since their first appearance, transformers have been used successfully in wide-ranging domains, from computer vision to natural language processing. Applying transformers to reinforcement learning by reformulating it as a sequence modelling problem was proposed only recently. Compared to other commonly explored reinforcement learning problems, the Rubik's Cube poses a unique set of challenges. It has a single solved state among quintillions of possible configurations, which leads to extremely sparse rewards. The proposed model, CubeTR, attends to longer sequences of actions and addresses the problem of sparse rewards. CubeTR learns how to solve the Rubik's Cube from arbitrary starting states without any human prior, and after move regularisation, the lengths of the solutions it generates are expected to be very close to those given by algorithms used by expert human solvers.…
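To make the "quintillions of possible configurations" and the resulting reward sparsity concrete, here is a minimal Python sketch. It uses the standard counting argument for the 3x3x3 cube and is not code from the paper. The function names and the 0/1 reward are illustrative assumptions, and the reward actually used by CubeTR may differ.

```python
from math import factorial


def rubiks_cube_state_count() -> int:
    """Count reachable states of a standard 3x3x3 cube (standard counting argument)."""
    corner_permutations = factorial(8)    # 8 corner pieces
    corner_orientations = 3 ** 7          # the last corner's twist is forced
    edge_permutations = factorial(12)     # 12 edge pieces
    edge_orientations = 2 ** 11           # the last edge's flip is forced
    # Corner and edge permutation parities must match, which halves the total.
    return (corner_permutations * corner_orientations
            * edge_permutations * edge_orientations) // 2


def sparse_reward(state, solved_state) -> float:
    """Hypothetical sparse reward: 1.0 only in the solved state, 0.0 elsewhere."""
    return 1.0 if state == solved_state else 0.0


STATE_COUNT = rubiks_cube_state_count()
print(f"{STATE_COUNT:,} reachable states, exactly one of which is rewarded")
```

With roughly 4.3 x 10^19 states and a single rewarded one, an agent exploring at random almost never observes a non-zero reward, which is the difficulty the abstract describes.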
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Bandit Algorithms Research
