DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Riku Arakawa, Sosuke Kobayashi, Yuya Unno, Yuta Tsuboi, and Shin-ichi Maeda

arXiv:1810.11748·cs.HC·October 30, 2018·48 cites

TL;DR

This paper introduces DQN-TAMER, a reinforcement learning method that effectively incorporates real-time human feedback, addressing exploration challenges and demonstrating improved performance in simulated and real-world maze tasks.

Contribution

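It proposes DQN-TAMER, an RL algorithm that combines real-time human feedback with conventional environment rewards, and it runs experiments against a modeled human observer whose feedback is binary, delayed, stochastic, unsustainable, and based on natural reactions.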
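
As a rough illustration of what combining the two signals can look like, the sketch below picks the action that maximizes a weighted sum of an environment-reward value estimate and a human-feedback estimate. This is an assumption made for illustration, not a formula confirmed by this page: the function name `combined_greedy_action`, the weights `alpha_q` and `alpha_h`, and the use of a simple weighted sum are all hypothetical.

```python
from typing import Sequence


def combined_greedy_action(
    q_values: Sequence[float],
    h_values: Sequence[float],
    alpha_q: float = 1.0,
    alpha_h: float = 1.0,
) -> int:
    """Return the index of the action with the highest weighted score.

    q_values: per-action value estimates learned from environment rewards.
    h_values: per-action estimates of the human feedback signal.
    alpha_q, alpha_h: hypothetical mixing weights (an assumption of this sketch).
    """
    if len(q_values) != len(h_values):
        raise ValueError("q_values and h_values must have the same length")
    if not q_values:
        raise ValueError("at least one action is required")
    # Weighted sum of the two estimates for every action.
    scores = [alpha_q * q + alpha_h * h for q, h in zip(q_values, h_values)]
    # Greedy choice: the first index holding the maximum score.
    return max(range(len(scores)), key=scores.__getitem__)
```

Setting `alpha_h` to zero reduces the choice to the environment-reward estimate alone, so a schedule that shrinks `alpha_h` over time would be one way to lean on human feedback early and on distant rewards later.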

Findings

01 DQN-TAMER outperforms baseline algorithms in Maze and Taxi environments.

02 The method effectively integrates human feedback with distant rewards.

03 A real-world application demonstrates facial expression recognition as a feedback signal.

Abstract

Exploration has been one of the greatest challenges in reinforcement learning (RL), which is a large obstacle in the application of RL to robotics. Even with state-of-the-art RL algorithms, building a well-learned agent often requires too many trials, mainly due to the difficulty of matching its actions with rewards in the distant future. A remedy for this is to train an agent with real-time feedback from a human observer who immediately gives rewards for some actions. This study tackles a series of challenges for introducing such a human-in-the-loop RL scheme. The first contribution of this work is our experiments with a precisely modeled human observer: binary, delay, stochasticity, unsustainability, and natural reaction. We also propose an RL method called DQN-TAMER, which efficiently uses both human feedback and distant rewards. We find that DQN-TAMER agents outperform their…
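
The observer properties named in the abstract (binary, delay, stochasticity, unsustainability) can be sketched as a small simulator. This is a minimal sketch under stated assumptions, not the authors' observer model: the class name `SimulatedObserver` and the parameters `delay_steps`, `feedback_prob`, and `budget_steps` are hypothetical, and natural reaction is not modeled.

```python
import random
from collections import deque
from typing import Deque, Optional


class SimulatedObserver:
    """Hypothetical observer that emits binary, delayed, sparse feedback."""

    def __init__(self, delay_steps: int = 2, feedback_prob: float = 0.5,
                 budget_steps: int = 100, seed: int = 0) -> None:
        self.feedback_prob = feedback_prob  # stochasticity: chance of reacting at all
        self.budget_steps = budget_steps    # unsustainability: observer stops after this many steps
        self.rng = random.Random(seed)
        # Delay: feedback produced at step t is delivered at step t + delay_steps.
        self.pending: Deque[Optional[int]] = deque([None] * delay_steps)
        self.step_count = 0

    def observe(self, action_was_good: bool) -> Optional[int]:
        """Record one agent step and return whatever feedback arrives now.

        Returns +1 or -1 (binary feedback) or None when nothing arrives.
        """
        self.step_count += 1
        signal: Optional[int] = None
        still_watching = self.step_count <= self.budget_steps
        if still_watching and self.rng.random() < self.feedback_prob:
            signal = 1 if action_was_good else -1
        self.pending.append(signal)
        return self.pending.popleft()
```

For example, with `delay_steps=2` and `feedback_prob=1.0`, the first two calls to `observe` return `None`, and the third returns the feedback for the first step. An agent trained against such a simulator has to work out which earlier action a late signal refers to, which is part of what makes the feedback hard to use.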

Code & Models

Repositories

JulienDesvergnes/human-reinforcement-learning
tf

Taxonomy

Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · EEG and Brain-Computer Interfaces