Model-Based Regularization for Deep Reinforcement Learning with Transcoder Networks

Felix Leibfried; Peter Vrancx

arXiv:1809.01906·cs.LG·November 21, 2018·6 cites

Open Access

TL;DR

This paper introduces a model-based regularization approach for deep reinforcement learning: adding a model-learning component to a DQN yields a transcoder network, whose prediction errors regularize the DQN loss. The approach improves sample efficiency and performance over vanilla DQN on 20 Atari games.

Contribution

It presents a regularization method that adds the prediction errors of a learned environment model to the basic DQN loss, giving a training signal at every time step.

Findings

01. Achieved superior results over vanilla DQN on 20 Atari games.

02. Enhanced sample efficiency in deep reinforcement learning.

03. Demonstrated the effectiveness of model-based regularization in practice.

Abstract

This paper proposes a new optimization objective for value-based deep reinforcement learning. We extend conventional Deep Q-Networks (DQNs) by adding a model-learning component, yielding a transcoder network. The prediction errors for the model are included in the basic DQN loss as additional regularizers. This augmented objective leads to a richer training signal that provides feedback at every time step. Moreover, because learning an environment model shares a common structure with the RL problem, we hypothesize that the resulting objective improves both sample efficiency and performance. We empirically confirm our hypothesis on a range of 20 games from the Atari benchmark, attaining superior results over vanilla DQN without model-based regularization.
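The augmented objective described in the abstract can be illustrated with a minimal sketch: a standard one-step temporal-difference loss plus weighted model prediction errors. Everything specific here is an assumption made for illustration and is not taken from this page: that the model component predicts the reward and the next observation, that the errors are squared errors, and the names `Transition`, `regularized_loss`, `lam_reward`, and `lam_obs`. The paper's actual loss terms and weights may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Transition:
    """Per-transition quantities (all field names are hypothetical)."""
    q_pred: float                    # Q(s, a) from the online network
    q_next_max: float                # max over a' of Q_target(s', a')
    reward: float                    # observed reward r
    done: bool                       # True if s' is terminal
    reward_pred: float               # model head: predicted reward (assumed)
    next_obs: Sequence[float]        # observed next observation, flattened
    next_obs_pred: Sequence[float]   # model head: predicted next observation (assumed)


def td_loss(t: Transition, gamma: float) -> float:
    """Squared one-step TD error, the basic DQN loss term."""
    target = t.reward + (0.0 if t.done else gamma * t.q_next_max)
    return (t.q_pred - target) ** 2


def model_losses(t: Transition) -> tuple[float, float]:
    """Squared reward error and mean squared next-observation error (assumed forms)."""
    reward_err = (t.reward_pred - t.reward) ** 2
    obs_err = sum((p - o) ** 2 for p, o in zip(t.next_obs_pred, t.next_obs)) / len(t.next_obs)
    return reward_err, obs_err


def regularized_loss(batch: Sequence[Transition], gamma: float = 0.99,
                     lam_reward: float = 1.0, lam_obs: float = 1.0) -> float:
    """Batch mean of the TD loss plus weighted model prediction errors."""
    total = 0.0
    for t in batch:
        reward_err, obs_err = model_losses(t)
        total += td_loss(t, gamma) + lam_reward * reward_err + lam_obs * obs_err
    return total / len(batch)
```

Setting both weights to zero recovers the plain TD loss, so in this sketch the model terms act purely as additive regularizers. Because the prediction errors are defined for every transition, they contribute a learning signal even when rewards are sparse.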

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Model Reduction and Neural Networks

Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network