Student-Initiated Action Advising via Advice Novelty

Ercument Ilhan; Jeremy Gow; Diego Perez-Liebana

arXiv:2010.00381·cs.LG·October 12, 2021

TL;DR

This paper introduces a student-initiated action advising method in deep reinforcement learning that uses Random Network Distillation to measure advice novelty, addressing limitations of existing approaches and improving learning efficiency.

Contribution

The proposed algorithm uses RND to measure the novelty of advice rather than of states: its novelty model is updated only on states where advice was received, which makes it more robust than existing state novelty-based methods.
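The idea above can be illustrated with a minimal sketch. This is not the authors' implementation: the class `RNDNovelty`, the function `maybe_ask_for_advice`, the linear predictor, the fixed `threshold`, and the learning rate are all assumptions made for illustration. It shows only the two ingredients the summary names: a fixed random target network with a trained predictor whose error serves as novelty, and predictor updates restricted to advised states.

```python
import numpy as np


class RNDNovelty:
    """Toy Random Network Distillation novelty estimator (illustrative only)."""

    def __init__(self, state_dim, embed_dim=8, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialised target network (never trained).
        self.W_target = rng.normal(size=(state_dim, embed_dim))
        # Predictor trained to imitate the target; linear here for brevity.
        self.W_pred = np.zeros((state_dim, embed_dim))
        self.lr = lr

    def _target(self, state):
        return np.tanh(state @ self.W_target)

    def novelty(self, state):
        # Novelty is the mean squared prediction error on this state.
        err = state @ self.W_pred - self._target(state)
        return float(np.mean(err ** 2))

    def update(self, state):
        # One gradient step on the mean squared error for this state.
        err = state @ self.W_pred - self._target(state)
        grad = 2.0 * np.outer(state, err) / err.size
        self.W_pred -= self.lr * grad


def maybe_ask_for_advice(rnd, state, teacher_policy, budget, threshold):
    """Request advice only if budget remains and the state is novel as advice.

    The predictor is updated only when advice is actually received, so
    novelty tracks "have I been advised here?" rather than "have I seen this?".
    Returns (advised_action_or_None, remaining_budget).
    """
    if budget > 0 and rnd.novelty(state) > threshold:
        action = teacher_policy(state)
        rnd.update(state)  # update on advised states only
        return action, budget - 1
    return None, budget
```

Because states visited without advice never reach `update`, their novelty stays high even if the student has encountered them many times, so a teacher that becomes available late can still be consulted on them.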

Findings

1. Performs comparably to state-of-the-art methods in standard scenarios.

2. Shows significant advantages in scenarios where existing methods fail.

3. Effectively mitigates feedback lag issues in advice timing.

Abstract

Action advising is a budget-constrained knowledge exchange mechanism between teacher-student peers that can help tackle exploration and sample inefficiency problems in deep reinforcement learning (RL). Most recently, student-initiated techniques that utilise state novelty and uncertainty estimations have obtained promising results. However, the approaches built on these estimations have some potential weaknesses. First, they assume that the convergence of the student's RL model implies less need for advice. This can be misleading in scenarios where the teacher is absent early on: the student is likely to learn suboptimally by itself, yet also ignore the teacher's assistance later. Second, in the presence of experience replay dynamics, the delay between encountering states and having them take effect in the RL model updates causes a feedback lag in what the student actually needs…

Code & Models

Repositories

ercumentilhan/advice-novelty (tf, Official)

Taxonomy

Methods: Experience Replay