A Study on Overfitting in Deep Reinforcement Learning
Chiyuan Zhang, Oriol Vinyals, Remi Munos, Samy Bengio

TL;DR
This paper systematically studies overfitting in deep reinforcement learning. It finds that standard agents can overfit in various ways and that commonly used techniques that add stochasticity do not reliably prevent or detect it, which highlights the need for better evaluation protocols.
Contribution
It provides a comprehensive analysis of overfitting phenomena in deep RL and discusses the implications for evaluation and generalization in RL agents.
Findings
Overfitting occurs in various forms in deep RL agents.
Commonly used techniques that add stochasticity do not reliably prevent or detect overfitting.
Test performance can vary drastically despite optimal training rewards.
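The last finding can be made concrete by reporting reward separately on training environments and on held-out ones. The sketch below is a toy illustration only, not the paper's experimental setup: the task, `make_levels`, `MemorizingAgent`, and `mean_reward` are all hypothetical names invented here. It shows how a policy that simply memorizes its training levels reaches optimal training reward while doing no better than chance on unseen levels.

```python
import random


def make_levels(n_levels, n_actions, seed):
    """Hypothetical toy task: each level has one rewarded action, chosen at random."""
    rng = random.Random(seed)
    # Prefix ids with the seed so level sets built from different seeds are disjoint.
    return {f"{seed}-{i}": rng.randrange(n_actions) for i in range(n_levels)}


class MemorizingAgent:
    """Lookup-table policy: stores the rewarded action for every training level."""

    def __init__(self, n_actions, seed=0):
        self.n_actions = n_actions
        self.table = {}
        self.rng = random.Random(seed)

    def train(self, levels):
        # "Training" here is pure memorization of level id -> rewarded action.
        self.table.update(levels)

    def act(self, level_id):
        # Unseen levels fall back to a uniformly random action.
        if level_id in self.table:
            return self.table[level_id]
        return self.rng.randrange(self.n_actions)


def mean_reward(agent, levels):
    """Average reward (1 for the rewarded action, 0 otherwise) over a set of levels."""
    return sum(agent.act(lid) == a for lid, a in levels.items()) / len(levels)


n_actions = 4
train_levels = make_levels(200, n_actions, seed=1)
test_levels = make_levels(200, n_actions, seed=2)  # held-out levels, never trained on

agent = MemorizingAgent(n_actions)
agent.train(train_levels)

train_reward = mean_reward(agent, train_levels)
test_reward = mean_reward(agent, test_levels)
gap = train_reward - test_reward  # generalization gap
print(f"train={train_reward:.2f} test={test_reward:.2f} gap={gap:.2f}")
```

Reporting only `train_reward` would make this agent look optimal. The gap appears only because evaluation uses levels that were excluded from training, which is the kind of protocol the summary above argues for.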
Abstract
Recent years have witnessed significant progress in deep Reinforcement Learning (RL). Empowered with large scale neural networks, carefully designed architectures, novel training algorithms and massively parallel computing devices, researchers are able to attack many challenging RL problems. However, in machine learning, more training power comes with a potential risk of more overfitting. As deep RL techniques are being applied to critical problems such as healthcare and finance, it is important to understand the generalization behaviors of the trained agents. In this paper, we conduct a systematic study of standard RL agents and find that they could overfit in various ways. Moreover, overfitting could happen "robustly": commonly used techniques in RL that add stochasticity do not necessarily prevent or detect overfitting. In particular, the same agents and learning algorithms could…
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Evolutionary Algorithms and Applications
