Time-Efficient Reward Learning via Visually Assisted Cluster Ranking

David Zhang; Micah Carroll; Andreea Bobu; Anca Dragan

arXiv:2212.00169·cs.LG·December 2, 2022

TL;DR

This paper introduces a visually assisted cluster-ranking method that makes reward learning more efficient by batching human comparisons, yielding better-performing agents on simple MuJoCo tasks for the same amount of human labeling time.

Contribution

It proposes a novel interactive GUI for state space labeling that leverages data visualization to batch human feedback, improving reward learning efficiency.

Findings

01 Increased reward learning efficiency with less human labeling time

02 Enhanced agent performance in MuJoCo tasks

03 Effective use of visualization for state space labeling

Abstract

One of the most successful paradigms for reward learning uses human feedback in the form of comparisons. Although these methods hold promise, human comparison labeling is expensive and time consuming, constituting a major bottleneck to their broader applicability. Our insight is that we can greatly improve how effectively human time is used in these approaches by batching comparisons together, rather than having the human label each comparison individually. To do so, we leverage data dimensionality-reduction and visualization techniques to provide the human with an interactive GUI displaying the state space, in which the user can label subportions of the state space. Across some simple MuJoCo tasks, we show that this high-level approach holds promise and is able to greatly increase the performance of the resulting agents, given the same amount of human labeling time.
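The batching idea in the abstract can be illustrated with a minimal sketch. It assumes the user has already grouped states into clusters (for example, by selecting regions in the GUI) and ranked those clusters from least to most preferred. The function name `comparisons_from_cluster_ranking`, the cluster names, and the state ids are hypothetical and are not taken from the paper; the sketch only shows how a single ranking over a few clusters might expand into many pairwise comparisons.

```python
from itertools import combinations


def comparisons_from_cluster_ranking(clusters, ranking):
    """Expand a ranking over clusters into pairwise state comparisons.

    clusters: dict mapping a cluster name to a list of state ids.
    ranking:  list of cluster names, ordered from least to most preferred.
    Returns a list of (worse_state, better_state) pairs.
    """
    pairs = []
    # combinations() preserves list order, so `low` always ranks below `high`.
    for low, high in combinations(ranking, 2):
        for worse in clusters[low]:
            for better in clusters[high]:
                pairs.append((worse, better))
    return pairs


# Hypothetical example: three clusters the user selected and ranked.
clusters = {
    "fallen": ["s0", "s1", "s2"],
    "standing": ["s3", "s4"],
    "moving_forward": ["s5", "s6", "s7"],
}
ranking = ["fallen", "standing", "moving_forward"]

pairs = comparisons_from_cluster_ranking(clusters, ranking)
print(len(pairs))  # 3*2 + 3*3 + 2*3 = 21 comparisons from one ranking
```

In this toy setting, ranking three clusters once yields 21 state-level comparisons, which is the kind of saving in labeling effort the abstract attributes to batching. States within the same cluster are left uncompared in this sketch.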

Taxonomy

Topics: Time Series Analysis and Forecasting · Data Stream Mining Techniques