Multi-trainer Interactive Reinforcement Learning System

Zhaori Guo, Timothy J. Norman, and Enrico H. Gerding

arXiv:2210.08050·cs.LG·October 18, 2022

Open Access

TL;DR

This paper introduces a multi-trainer interactive reinforcement learning system that aggregates binary feedback from multiple imperfect trainers to improve agent training in reward-sparse environments. Its aggregation method is reported to be more accurate than majority, weighted, and Bayesian voting, and the resulting policy is closer to optimal.

Contribution

The paper proposes Multi-Trainer Interactive Reinforcement Learning (MTIRL), a system that combines binary feedback from multiple imperfect trainers into a more reliable reward, making training more dependable and the learned policy closer to optimal.

Findings

01 The aggregation method outperforms majority voting, weighted voting, and Bayesian voting.

02 MTIRL achieves higher feedback accuracy.

03 The policy trained with the review model is closer to optimal.

Abstract

Interactive reinforcement learning can effectively facilitate agent training via human feedback. However, such methods often require the human teacher to know the correct action for the agent to take. In other words, if the human teacher is not always reliable, they will not be able to guide the agent consistently through its training. In this paper, we propose a more effective interactive reinforcement learning system that introduces multiple trainers, namely Multi-Trainer Interactive Reinforcement Learning (MTIRL), which aggregates the binary feedback from multiple imperfect trainers into a more reliable reward for training an agent in a reward-sparse environment. In particular, our trainer feedback aggregation experiments show that our aggregation method has the best accuracy when compared with majority voting, weighted voting, and the Bayesian…
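To make the comparison concrete, the following is a minimal Python sketch of the three baseline ideas the abstract names: majority voting, weighted voting, and a Bayesian-style vote. It is not the MTIRL aggregation method, whose details are not given here. The function names, the encoding of feedback as +1/-1, the use of trainer accuracies as weights, and the assumption that trainers are independent with known accuracies are all illustrative assumptions.

```python
import math
from typing import Sequence


def majority_vote(feedback: Sequence[int]) -> int:
    """Aggregate +1/-1 feedback by simple majority.

    Ties resolve to +1, which is an arbitrary choice for this sketch.
    """
    return 1 if sum(feedback) >= 0 else -1


def weighted_vote(feedback: Sequence[int], weights: Sequence[float]) -> int:
    """Aggregate +1/-1 feedback, scaling each vote by a per-trainer weight."""
    score = sum(w * f for f, w in zip(feedback, weights))
    return 1 if score >= 0 else -1


def bayesian_vote(
    feedback: Sequence[int],
    accuracies: Sequence[float],
    prior_positive: float = 0.5,
) -> int:
    """Naive-Bayes style vote over +1/-1 feedback.

    Assumes (hypothetically) that trainers err independently and that each
    trainer's accuracy p, with 0 < p < 1, is known in advance.
    """
    # Start from the prior log-odds that the action deserves positive feedback.
    log_odds = math.log(prior_positive / (1.0 - prior_positive))
    for f, p in zip(feedback, accuracies):
        # Each vote shifts the log-odds by the trainer's log-likelihood ratio.
        llr = math.log(p / (1.0 - p))
        log_odds += llr if f == 1 else -llr
    return 1 if log_odds >= 0 else -1


# One reliable trainer says +1, two weak trainers say -1.
feedback = [1, -1, -1]
accuracies = [0.9, 0.6, 0.6]

print(majority_vote(feedback))              # -1
print(weighted_vote(feedback, accuracies))  # -1
print(bayesian_vote(feedback, accuracies))  # 1
```

In this toy case the two weak trainers outvote the reliable one under majority and accuracy-weighted voting, while the log-odds vote sides with the reliable trainer. It only illustrates why the choice of aggregation rule matters when trainers are imperfect; it says nothing about how MTIRL itself estimates trainer reliability.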

Taxonomy

Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · Evolutionary Algorithms and Applications