Model-Based Regularization for Deep Reinforcement Learning with Transcoder Networks
Felix Leibfried, Peter Vrancx

TL;DR
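This paper introduces model-based regularization for deep reinforcement learning: DQNs are extended with a model-learning component, yielding a transcoder network whose prediction errors are added to the DQN loss. The authors report improved sample efficiency and performance over vanilla DQN on 20 Atari games.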
Contribution
It presents a regularization method that incorporates environment modeling into DQNs, adding model prediction errors to the loss so that training receives feedback at every time step.
Findings
Achieved superior results over vanilla DQN on 20 Atari games.
Enhanced sample efficiency in deep reinforcement learning.
Demonstrated the effectiveness of model-based regularization in practice.
Abstract
This paper proposes a new optimization objective for value-based deep reinforcement learning. We extend conventional Deep Q-Networks (DQNs) by adding a model-learning component, yielding a transcoder network. The prediction errors of the model are included in the basic DQN loss as additional regularizers. This augmented objective leads to a richer training signal that provides feedback at every time step. Moreover, because learning an environment model shares a common structure with the RL problem, we hypothesize that the resulting objective improves both sample efficiency and performance. We empirically confirm our hypothesis on a range of 20 games from the Atari benchmark, attaining superior results over vanilla DQN without model-based regularization.
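To make the augmented objective concrete, here is a minimal NumPy sketch of a DQN loss with model prediction errors added as regularizers. The abstract does not specify what the model predicts, how the errors are measured, or how they are weighted, so everything here is an assumption for illustration: the function name `augmented_dqn_loss`, the choice of reward and next-observation prediction as the model terms, the squared-error form, and the weights `reward_weight` and `obs_weight` are all hypothetical and may differ from the authors' formulation.

```python
import numpy as np

def augmented_dqn_loss(q_values, actions, td_targets,
                       pred_rewards, rewards,
                       pred_next_obs, next_obs,
                       reward_weight=1.0, obs_weight=1.0):
    """Sketch of a DQN loss regularized by model prediction errors.

    All shapes are batched: q_values is (B, num_actions), actions is (B,),
    td_targets, pred_rewards, rewards are (B,), observations are (B, ...).
    The squared-error form and the two weights are assumptions.
    """
    batch = np.arange(len(actions))
    # Standard TD term: Q-value of the action actually taken vs. its target.
    q_taken = q_values[batch, actions]
    td_loss = np.mean((td_targets - q_taken) ** 2)
    # Assumed model terms: errors in predicted reward and next observation.
    reward_loss = np.mean((pred_rewards - rewards) ** 2)
    obs_loss = np.mean((pred_next_obs - next_obs) ** 2)
    total = td_loss + reward_weight * reward_loss + obs_weight * obs_loss
    return total, td_loss, reward_loss, obs_loss

# Toy batch of two transitions with two actions and 4-dimensional observations.
q_values = np.array([[1.0, 2.0], [3.0, 4.0]])
actions = np.array([1, 0])
td_targets = np.array([2.5, 3.0])
pred_rewards = np.array([0.0, 1.0])
rewards = np.array([1.0, 1.0])
pred_next_obs = np.zeros((2, 4))
next_obs = np.ones((2, 4))

total, td_loss, reward_loss, obs_loss = augmented_dqn_loss(
    q_values, actions, td_targets, pred_rewards, rewards,
    pred_next_obs, next_obs)
```

Setting both weights to zero recovers the plain TD loss, which is one way to read the "vanilla DQN without model-based regularization" baseline. The model terms are nonzero on every transition even when rewards are sparse, which is the sense in which the objective gives feedback at every time step.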
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Model Reduction and Neural Networks
Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network
