DADAgger: Disagreement-Augmented Dataset Aggregation

Akash Haridas; Karim Hamadeh; Samarendra Chandan Bindu Dash

arXiv:2301.01348·cs.LG·January 5, 2023

Open Access

TL;DR

DADAgger modifies DAgger so that the expert is queried only on out-of-distribution states. This reduces the number of expert queries while keeping performance comparable, and it can also be used to build efficient, well-balanced training datasets.

Contribution

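The paper introduces an out-of-distribution (OOD) detection method for selective expert querying in imitation learning. States are flagged as OOD when a dropout-simulated ensemble disagrees on the action, which reduces the number of expert queries compared to DAgger.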

Findings

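01 Achieves performance comparable to DAgger with fewer expert queries on the Car Racing and Half Cheetah environments

02 Outperforms a random sampling baseline on the same environments

03 Can build a well-balanced training dataset starting from no initial data, querying the expert only to resolve uncertainty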

Abstract

DAgger is an imitation learning algorithm that aggregates its original dataset by querying the expert on all samples encountered during training. To reduce the number of samples queried, we propose a modification to DAgger, called DADAgger, which queries the expert only for state-action pairs that are out of distribution (OOD). OOD states are identified by measuring the variance of the action predictions of an ensemble of models on each state, which we simulate using dropout. On the Car Racing and Half Cheetah environments, DADAgger achieves performance comparable to DAgger with fewer expert queries, and better performance than a random sampling baseline. We also show that our algorithm may be used to build efficient, well-balanced training datasets by running with no initial data and querying the expert only to resolve uncertainty.
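
The disagreement measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny one-hidden-layer policy, the function names (`forward_with_dropout`, `disagreement`, `select_states_to_query`), the number of passes, the dropout rate, and the fixed variance threshold are all assumptions made for illustration. The abstract states only that the variance of the action predictions of a dropout-simulated ensemble is used to identify OOD states.

```python
import math
import random
from statistics import pvariance


def forward_with_dropout(state, w_hidden, w_out, drop_p, rng):
    """One stochastic forward pass of a toy one-hidden-layer policy.

    Dropout stays active at prediction time, so each pass behaves like
    a different member of an ensemble (hypothetical architecture).
    """
    hidden = []
    for row in w_hidden:
        activation = math.tanh(sum(w * s for w, s in zip(row, state)))
        keep = rng.random() >= drop_p
        # Inverted dropout: rescale kept units so the expected value is unchanged.
        hidden.append(activation / (1.0 - drop_p) if keep else 0.0)
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


def disagreement(state, w_hidden, w_out, n_passes=20, drop_p=0.2, rng=None):
    """Mean per-dimension variance of the action over several dropout passes."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    actions = [
        forward_with_dropout(state, w_hidden, w_out, drop_p, rng)
        for _ in range(n_passes)
    ]
    per_dim = [pvariance(dim) for dim in zip(*actions)]
    return sum(per_dim) / len(per_dim)


def select_states_to_query(states, w_hidden, w_out, threshold, **kwargs):
    """Keep only states whose disagreement exceeds the (assumed) threshold."""
    return [
        s for s in states
        if disagreement(s, w_hidden, w_out, **kwargs) > threshold
    ]
```

In a DAgger-style loop, only the states returned by `select_states_to_query` would be sent to the expert for labelling and added to the aggregated dataset. How the threshold is chosen is left open here; a fixed cutoff is just one simple option.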

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting