A Study on Overfitting in Deep Reinforcement Learning

Chiyuan Zhang; Oriol Vinyals; Remi Munos; Samy Bengio

arXiv:1804.06893·cs.LG·April 23, 2018·237 cites

TL;DR

This paper systematically investigates overfitting in deep reinforcement learning, revealing that standard agents can overfit in various ways and that common stochastic techniques do not reliably prevent it, highlighting the need for better evaluation protocols.

Contribution

The paper provides a comprehensive analysis of overfitting phenomena in deep RL and discusses what they imply for how RL agents are evaluated and how well they generalize.

Findings

01. Overfitting occurs in various forms in deep RL agents.
02. Common stochastic techniques do not reliably prevent overfitting.
03. Test performance can vary drastically despite optimal training rewards.
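The gap between training reward and test performance described in the findings can be illustrated with a toy sketch. This is not the paper's experimental setup: `make_level`, `MemorizingAgent`, the seed ranges, and the scoring rule are all hypothetical names and choices made up for illustration. The sketch assumes an agent that can identify each level and simply memorizes the correct actions for the levels it trained on, then is scored on a disjoint set of held-out levels.

```python
import random


def make_level(seed, n_steps=8, n_actions=4):
    """Hypothetical 'level': a fixed sequence of correct actions derived from a seed."""
    rng = random.Random(seed)
    return [rng.randrange(n_actions) for _ in range(n_steps)]


class MemorizingAgent:
    """Toy agent (an assumption, not the paper's agent) that memorizes training levels."""

    def __init__(self, n_actions=4, seed=0):
        self.memory = {}
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def train(self, level_seeds):
        # "Training" here is pure memorization of each level's action sequence.
        for s in level_seeds:
            self.memory[s] = make_level(s)

    def act(self, level_seed, step):
        # On a memorized level, replay the stored action; otherwise guess uniformly.
        if level_seed in self.memory:
            return self.memory[level_seed][step]
        return self.rng.randrange(self.n_actions)


def episode_return(agent, level_seed):
    """Fraction of steps on which the agent picked the correct action."""
    level = make_level(level_seed)
    hits = sum(1 for t, a in enumerate(level) if agent.act(level_seed, t) == a)
    return hits / len(level)


def mean_return(agent, seeds):
    return sum(episode_return(agent, s) for s in seeds) / len(seeds)


# Disjoint training and held-out level sets (seed ranges chosen arbitrarily).
train_seeds = list(range(0, 50))
test_seeds = list(range(1000, 1050))

agent = MemorizingAgent()
agent.train(train_seeds)

train_score = mean_return(agent, train_seeds)
test_score = mean_return(agent, test_seeds)
gap = train_score - test_score

print(f"train return: {train_score:.2f}")
print(f"test return:  {test_score:.2f}")
print(f"generalization gap: {gap:.2f}")
```

By construction the training return is perfect, while the held-out return falls to roughly chance level. Reporting only the training number would hide that gap, which is why the sketch scores the two level sets separately.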

Abstract

Recent years have witnessed significant progress in deep Reinforcement Learning (RL). Empowered with large-scale neural networks, carefully designed architectures, novel training algorithms and massively parallel computing devices, researchers are able to attack many challenging RL problems. However, in machine learning, more training power comes with a potential risk of more overfitting. As deep RL techniques are being applied to critical problems such as healthcare and finance, it is important to understand the generalization behavior of the trained agents. In this paper, we conduct a systematic study of standard RL agents and find that they could overfit in various ways. Moreover, overfitting could happen "robustly": commonly used techniques in RL that add stochasticity do not necessarily prevent or detect overfitting. In particular, the same agents and learning algorithms could…

Code & Models

Repositories

oliviawl/image_classification_utkface · tf

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Evolutionary Algorithms and Applications