RadGrad: Active learning with loss gradients

Paul Budnarain; Renato Ferreira Pinto Junior; Ilan Kogan

arXiv:1906.07838 · cs.LG · June 20, 2019 · 1 citation

Open Access

TL;DR

RadGrad is an active learning algorithm for sequential decision problems. It queries the expert selectively, based on a prediction of agent error and a proxy for agent risk, which reduces the number of expert queries while maintaining performance. It still faces challenges in a more complex environment.

Contribution

Introduces RadGrad, an active learning method that selectively queries the expert based on a prediction of agent error and a proxy for agent risk, with the potential to improve upon existing safety-aware algorithms.

Findings

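01. RadGrad matches or exceeds the performance of DAgger and SafeDAgger in one simulated environment.

02. A more complex environment poses challenges to RadGrad and to other safety-aware algorithms.

03. RadGrad substantially reduces the number of expert queries while maintaining the performance of unrestrained expert querying.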

Abstract

Solving sequential decision prediction problems, including those in imitation learning settings, requires mitigating the problem of covariate shift. The standard approach, DAgger, relies on capturing expert behaviour in all states that the agent reaches. In real-world settings, querying an expert is costly. We propose a new active learning algorithm that selectively queries the expert, based on both a prediction of agent error and a proxy for agent risk, that maintains the performance of unrestrained expert querying systems while substantially reducing the number of expert queries made. We show that our approach, RadGrad, has the potential to improve upon existing safety-aware algorithms, and matches or exceeds the performance of DAgger and variants (i.e., SafeDAgger) in one simulated environment. However, we also find that a more complex environment poses challenges not only to our…
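
The abstract describes the selective-querying idea only at a high level, so what follows is a minimal sketch under stated assumptions, not the authors' implementation. It shows one DAgger-style data-collection round in which the expert labels a visited state only when a gate fires. Everything named here is hypothetical: `should_query`, the two thresholds, the toy `predict_error` and `risk_proxy` functions, and the choice to combine the two signals with a logical OR. The abstract does not say how RadGrad computes or combines these signals.

```python
def should_query(predicted_error, risk, error_threshold=0.7, risk_threshold=0.85):
    """Hypothetical gate: ask the expert when either signal is high.

    The OR rule and both thresholds are assumptions made for illustration.
    """
    return predicted_error > error_threshold or risk > risk_threshold


def gated_round(visited_states, expert, predict_error, risk_proxy, dataset):
    """Label only the visited states that pass the gate; return the query count."""
    queries = 0
    for state in visited_states:
        if should_query(predict_error(state), risk_proxy(state)):
            dataset.append((state, expert(state)))  # one expert query
            queries += 1
    return queries


def run_demo():
    # Toy 1-D problem: states in [0, 1], expert picks action 1 at or above 0.5.
    visited_states = [i / 10 for i in range(11)]

    def expert(state):
        return 1 if state >= 0.5 else 0

    # Assumed error predictor: the agent is least reliable near the boundary.
    def predict_error(state):
        return 1 - 2 * abs(state - 0.5)

    # Assumed risk proxy: states near 1.0 are treated as riskier.
    def risk_proxy(state):
        return state

    dataset = []
    queries = gated_round(visited_states, expert, predict_error, risk_proxy, dataset)
    return queries, len(visited_states), dataset


if __name__ == "__main__":
    queries, total, _ = run_demo()
    print(f"expert queried on {queries} of {total} visited states")
```

Unrestrained DAgger would query the expert on every visited state. In this toy setup the gate skips states where both signals are low, which is the kind of saving the abstract refers to. How much is saved in practice depends entirely on the quality of the error predictor and the risk proxy.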

Taxonomy

Topics: Machine Learning and Algorithms · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research