End-to-End Robotic Reinforcement Learning without Reward Engineering
Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine

TL;DR
This paper introduces a method enabling robots to learn manipulation tasks directly from images without manually designing reward functions, using minimal supervision through active querying of success labels, thus simplifying real-world reinforcement learning.
Contribution
The authors propose a reinforcement learning approach that learns its reward signal from a few examples of successful outcomes plus actively queried success labels, eliminating the need for manual reward engineering in robotic tasks.
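To make the idea concrete, here is a minimal sketch of a learned success classifier used as a reward, with a simple active-query step. It is an illustration under stated assumptions, not the authors' implementation: the 2-D feature vectors stand in for image observations, the classifier is plain logistic regression, and the function names (`train_success_classifier`, `success_probability`, `select_queries`) and the "query the most confident states" rule are hypothetical choices made for this example.

```python
import numpy as np


def train_success_classifier(features, labels, lr=0.5, steps=500):
    """Fit logistic regression by gradient descent; returns (weights, bias)."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))
        grad = p - labels  # gradient of the log-loss w.r.t. the logits
        w -= lr * features.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b


def success_probability(features, w, b):
    """Classifier's estimate that each state is a successful outcome."""
    return 1.0 / (1.0 + np.exp(-(features @ w + b)))


def select_queries(features, w, b, k=1):
    """Indices of the k states the classifier rates most likely successful.

    Assumption: these are the states shown to a human for a yes/no label.
    """
    probs = success_probability(features, w, b)
    return np.argsort(-probs)[:k]


# Toy data (hypothetical): successes cluster near (1, 1), failures near (0, 0).
rng = np.random.default_rng(0)
successes = rng.normal(loc=1.0, scale=0.1, size=(20, 2))
failures = rng.normal(loc=0.0, scale=0.1, size=(20, 2))
X = np.vstack([successes, failures])
y = np.concatenate([np.ones(20), np.zeros(20)])
w, b = train_success_classifier(X, y)

# States visited by a policy; the classifier output serves as the reward.
visited = np.array([[0.0, 0.1], [0.5, 0.4], [0.95, 1.0]])
rewards = success_probability(visited, w, b)
query_idx = select_queries(visited, w, b, k=1)
```

In this toy setup only the selected state would be sent to a human for labeling, and the answer would be appended to the training set before the classifier is refit. That is one way to keep the labeled fraction of states small.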
Findings
Successfully learned object arrangement, book placement, and cloth draping from images.
Achieved effective learning within 1-4 hours of real-world interaction.
Reduced supervision by requiring labels for only a small fraction of states.
Abstract
The combination of deep neural network models and reinforcement learning algorithms can make it possible to learn policies for robotic behaviors that directly read in raw sensory inputs, such as camera images, effectively subsuming both estimation and control into one model. However, real-world applications of reinforcement learning must specify the goal of the task by means of a manually programmed reward function, which in practice requires either designing the very same perception pipeline that end-to-end reinforcement learning promises to avoid, or else instrumenting the environment with additional sensors to determine if the task has been performed successfully. In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively…
Taxonomy
Topics: Reinforcement Learning in Robotics
