Unsupervised Perceptual Rewards for Imitation Learning
Pierre Sermanet, Kelvin Xu, Sergey Levine

TL;DR
This paper introduces an unsupervised method that automatically derives perceptual reward functions from a few demonstrations using deep visual features, enabling reinforcement learning of complex robotic tasks without explicit sub-goal specification.
Contribution
The authors propose a novel approach that leverages deep models to infer intermediate task steps and create reward functions without manual engineering or explicit sub-goal definitions.
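As a rough illustration of this idea, the sketch below builds one reward function per intermediate step from a handful of demonstrations. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes each demonstration frame has already been converted to a feature vector by some pretrained deep visual model, it splits every demonstration uniformly in time into a fixed number of steps, and it scores a new frame with a Gaussian-style similarity to each step's demonstration statistics. The function names (`split_into_steps`, `fit_step_models`, `step_reward`) and the toy two-dimensional features are hypothetical.

```python
import math
from statistics import mean, pstdev


def split_into_steps(demo, n_steps):
    """Uniformly split one demonstration (a list of feature vectors)
    into n_steps contiguous segments. Assumes len(demo) >= n_steps."""
    bounds = [round(i * len(demo) / n_steps) for i in range(n_steps + 1)]
    return [demo[bounds[i]:bounds[i + 1]] for i in range(n_steps)]


def fit_step_models(demos, n_steps):
    """For each step, pool the frames from all demonstrations and compute
    the per-dimension mean and standard deviation of their features."""
    pooled = [[] for _ in range(n_steps)]
    for demo in demos:
        for k, segment in enumerate(split_into_steps(demo, n_steps)):
            pooled[k].extend(segment)
    models = []
    for frames in pooled:
        dims = list(zip(*frames))  # transpose: one tuple per feature dimension
        mu = [mean(d) for d in dims]
        # Floor the std so constant features do not cause division by zero.
        sigma = [max(pstdev(d), 1e-3) for d in dims]
        models.append((mu, sigma))
    return models


def step_reward(feature, model):
    """Reward in (0, 1]: close to 1 when the frame's features match the
    step's demonstration statistics, decaying as they move away."""
    mu, sigma = model
    z2 = [((x - m) / s) ** 2 for x, m, s in zip(feature, mu, sigma)]
    return math.exp(-0.5 * mean(z2))


# Toy usage: two short demonstrations with made-up 2-D features that
# drift from around (0, 0) at the start to around (1, 1) at the end.
demo_a = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
demo_b = [[0.1, 0.1], [0.0, 0.0], [1.0, 1.0], [0.9, 0.9]]
models = fit_step_models([demo_a, demo_b], n_steps=2)

final_step = models[1]
print(step_reward([1.0, 1.0], final_step))  # frame resembling the end of the demos
print(step_reward([0.0, 0.0], final_step))  # frame resembling the start of the demos
```

In this toy setup, a frame that resembles the end of the demonstrations scores higher under the final step's model than a frame that resembles the start, which is the kind of dense, step-wise signal a reinforcement learning agent could be trained on.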
Findings
Learned a door-opening skill on a real robot from human demonstrations.
Evaluated the learned reward functions qualitatively and quantitatively against human-designed rewards.
Demonstrated applicability to complex manipulation tasks in real-world settings.
Abstract
Reward function design and exploration time are arguably the biggest obstacles to the deployment of reinforcement learning (RL) agents in the real world. In many real-world tasks, designing a reward function takes considerable hand engineering and often requires additional sensors to be installed just to measure whether the task has been executed successfully. Furthermore, many interesting tasks consist of multiple implicit intermediate steps that must be executed in sequence. Even when the final outcome can be measured, it does not necessarily provide feedback on these intermediate steps. To address these issues, we propose leveraging the abstraction power of intermediate visual representations learned by deep models to quickly infer perceptual reward functions from small numbers of demonstrations. We present a method that is able to identify key intermediate steps of a task from only…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Neural and Behavioral Psychology Studies
