Comparing Human-Centric and Robot-Centric Sampling for Robot Deep Learning from Demonstrations

Michael Laskey; Caleb Chuck; Jonathan Lee; Jeffrey Mahler; Sanjay Krishnan; Kevin Jamieson; Anca Dragan; Ken Goldberg

arXiv:1610.00850·cs.RO·March 30, 2017·2 cites

TL;DR

This paper compares human-centric and robot-centric sampling methods for robot learning from demonstrations, analyzing their effectiveness and limitations in different models and tasks, including theoretical guarantees.

Contribution

It provides a comprehensive comparison of HC and RC sampling methods, highlighting conditions where HC guarantees convergence and RC may fail, especially with deep learning models.

Findings

1. RC outperforms HC with linear SVMs in simulation.
2. Deep models show no significant advantage of RC over HC.
3. Human-provided corrections in RC can be highly inconsistent.

Abstract

Motivated by recent advances in Deep Learning for robot control, this paper considers two learning algorithms in terms of how they acquire demonstrations. "Human-Centric" (HC) sampling is the standard supervised learning algorithm, where a human supervisor demonstrates the task by teleoperating the robot to provide trajectories consisting of state-control pairs. "Robot-Centric" (RC) sampling is an increasingly popular alternative used in algorithms such as DAgger, where a human supervisor observes the robot executing a learned policy and provides corrective control labels for each state visited. RC sampling can be challenging for human supervisors and prone to mislabeling. RC sampling can also induce error in policy performance because it repeatedly visits areas of the state space that are harder to learn. Although policies learned with RC sampling can be superior to HC sampling for…
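As a rough illustration of the two sampling schemes the abstract describes, the following minimal sketch contrasts where the training states come from. It is not the paper's implementation: the 1-D dynamics, the clipped-proportional `supervisor`, the one-parameter linear policy, and the helper names (`rollout`, `fit_linear`, `hc_sampling`, `rc_sampling`) are all hypothetical choices made for illustration. In both schemes the supervisor provides the control labels; the only difference is whose policy generates the visited states.

```python
import random

def supervisor(x):
    # Hypothetical expert (an assumption, not from the paper):
    # push the state toward zero with control saturated to [-1, 1].
    return max(-1.0, min(1.0, -0.5 * x))

def rollout(policy, x0, horizon=10):
    # Toy 1-D dynamics x' = x + u; returns the states visited under `policy`.
    states, x = [], x0
    for _ in range(horizon):
        states.append(x)
        x = x + policy(x)
    return states

def fit_linear(data):
    # Least-squares fit of the one-parameter policy u = w * x.
    num = sum(x * u for x, u in data)
    den = sum(x * x for x, _ in data)
    return num / den if den > 0 else 0.0

def hc_sampling(num_demos, rng):
    # Human-Centric: states come from the supervisor's own trajectories.
    data = []
    for _ in range(num_demos):
        x0 = rng.uniform(-4.0, 4.0)
        data += [(x, supervisor(x)) for x in rollout(supervisor, x0)]
    return fit_linear(data), data

def rc_sampling(num_iters, rng):
    # Robot-Centric (DAgger-style): after an initial supervisor demonstration,
    # states come from the current learned policy; the supervisor relabels them.
    data, w = [], 0.0
    for i in range(num_iters):
        policy = supervisor if i == 0 else (lambda x, w=w: w * x)
        x0 = rng.uniform(-4.0, 4.0)
        data += [(x, supervisor(x)) for x in rollout(policy, x0)]
        w = fit_linear(data)  # retrain on the aggregated dataset
    return w, data

if __name__ == "__main__":
    rng = random.Random(0)
    w_hc, _ = hc_sampling(5, rng)
    w_rc, _ = rc_sampling(5, rng)
    print("HC weight:", w_hc, "RC weight:", w_rc)
```

In this toy setting the supervisor is a function and labels every state consistently. The mislabeling the abstract attributes to human supervisors under RC sampling is not modeled here.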

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Machine Learning and Algorithms