Assessing Generalization in Deep Reinforcement Learning
Charles Packer, Katelyn Gao, Jernej Kos, Philipp Krähenbühl, Vladlen Koltun, Dawn Song

TL;DR
This paper introduces a benchmark and systematic study to evaluate the generalization capabilities of deep reinforcement learning algorithms across diverse environments, revealing that standard algorithms often outperform specialized generalization methods.
Contribution
It provides a controlled benchmark and experimental protocol for assessing generalization in deep RL, facilitating fair comparisons and progress in the field.
Findings
Vanilla deep RL algorithms generalize better than specialized schemes designed specifically for generalization.
The benchmark includes diverse environments for comprehensive evaluation.
Systematic empirical study highlights gaps and opportunities in current methods.
Abstract
Deep reinforcement learning (RL) has achieved breakthrough results on many tasks, but agents often fail to generalize beyond the environment they were trained in. As a result, deep RL algorithms that promote generalization are receiving increasing attention. However, works in this area use a wide variety of tasks and experimental setups for evaluation. The literature lacks a controlled assessment of the merits of different generalization schemes. Our aim is to catalyze community-wide progress on generalization in deep RL. To this end, we present a benchmark and experimental protocol, and conduct a systematic empirical study. Our framework contains a diverse set of environments, our methodology covers both in-distribution and out-of-distribution generalization, and our evaluation includes deep RL algorithms that specifically tackle generalization. Our key finding is that 'vanilla' deep…
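The distinction the abstract draws between in-distribution and out-of-distribution evaluation can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the parameter ranges, the `toy_episode_success` stand-in for running an agent, and the `success_rate` helper are all hypothetical. The idea is only that an agent is scored once on environment parameters drawn from the range it was trained on and once on a range it never saw.

```python
import random

# Hypothetical parameter ranges; not taken from the paper.
TRAIN_RANGE = (0.5, 1.0)      # environment parameter values seen during training
HELD_OUT_RANGE = (1.0, 2.0)   # values the agent never saw during training


def toy_episode_success(param: float, tuned_for: float = 0.75) -> bool:
    """Stand-in for running one episode with a trained agent.

    The toy 'agent' succeeds only when the environment parameter is
    close to the value it was tuned for.
    """
    return abs(param - tuned_for) <= 0.3


def success_rate(param_range, episodes: int, rng: random.Random) -> float:
    """Fraction of successful episodes, sampling a fresh parameter per episode."""
    low, high = param_range
    wins = sum(toy_episode_success(rng.uniform(low, high)) for _ in range(episodes))
    return wins / episodes


rng = random.Random(0)
in_dist = success_rate(TRAIN_RANGE, 1000, rng)       # same range as training
out_dist = success_rate(HELD_OUT_RANGE, 1000, rng)   # held-out range
print(f"in-distribution: {in_dist:.2f}, out-of-distribution: {out_dist:.2f}")
```

In a real study the stand-in function would be replaced by rollouts of a trained policy, and the gap between the two numbers is what a generalization benchmark of this kind is meant to expose.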
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Mobile Crowdsensing and Crowdsourcing
