TL;DR
This paper introduces the SPOT framework, which enhances reinforcement learning efficiency for long-horizon multi-step visual tasks and demonstrates successful sim-to-real transfer without additional fine-tuning.
Contribution
The paper presents an RL framework that explores only within action safety zones, learns about unsafe regions without exploring them, and prioritizes experiences that reverse earlier progress, improving efficiency and enabling direct sim-to-real transfer for long-horizon tasks.
Findings
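Raised simulated trial success rates from 13% to 100% when stacking 4 cubes and from 13% to 99% when creating rows of 4 cubes.
Improved efficiency in actions per trial, typically by 30% or more, with training taking 1-20k actions depending on the task.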
Demonstrated successful real-world transfer with no additional fine-tuning.
Abstract
Current Reinforcement Learning (RL) algorithms struggle with long-horizon tasks where time can be wasted exploring dead ends and task progress may be easily reversed. We develop the SPOT framework, which explores within action safety zones, learns about unsafe regions without exploring them, and prioritizes experiences that reverse earlier progress to learn with remarkable efficiency. The SPOT framework successfully completes simulated trials of a variety of tasks, improving a baseline trial success rate from 13% to 100% when stacking 4 cubes, from 13% to 99% when creating rows of 4 cubes, and from 84% to 95% when clearing toys arranged in adversarial patterns. Efficiency with respect to actions per trial typically improves by 30% or more, while training takes just 1-20k actions, depending on the task. Furthermore, we demonstrate direct sim to real transfer. We are able to create…
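The idea of exploring only within action safety zones while still learning about unsafe regions can be illustrated with a small tabular sketch. This is not the authors' implementation: the discrete action list, the boolean `safe_mask`, the function names, and the choice to pull unsafe actions toward a value of 0 are all assumptions made for illustration.

```python
import random

def masked_epsilon_greedy(q_values, safe_mask, epsilon, rng):
    """Pick an action index, restricted to actions flagged as safe (hypothetical helper)."""
    safe_actions = [i for i, ok in enumerate(safe_mask) if ok]
    if not safe_actions:
        raise ValueError("no safe action available")
    if rng.random() < epsilon:
        # Explore, but only inside the safety zone.
        return rng.choice(safe_actions)
    # Exploit: best-valued action among the safe ones.
    return max(safe_actions, key=lambda i: q_values[i])

def masked_q_update(q_values, safe_mask, action, reward, next_best,
                    alpha=0.5, gamma=0.9):
    """Return updated Q-values (hypothetical helper).

    The executed action gets a standard temporal-difference update.
    Unsafe actions are never executed; their values are moved toward 0
    (an assumed target), so the agent learns about them without trying them.
    """
    new_q = list(q_values)
    target = reward + gamma * next_best
    new_q[action] += alpha * (target - new_q[action])
    for i, ok in enumerate(safe_mask):
        if not ok:
            new_q[i] += alpha * (0.0 - new_q[i])
    return new_q

if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.2, 0.9, 0.5]
    mask = [True, False, True]  # action 1 is assumed unsafe
    a = masked_epsilon_greedy(q, mask, epsilon=0.0, rng=rng)
    q = masked_q_update(q, mask, a, reward=1.0, next_best=0.0)
    print(a, q)
```

In this sketch the highest-valued action (index 1) is never chosen because it is masked out, yet its value still shrinks on every update, which is one simple way to learn about an unsafe region without visiting it.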
