Improving Generalization in Reinforcement Learning with Mixture Regularization

Kaixin Wang; Bingyi Kang; Jie Shao; Jiashi Feng

arXiv:2010.10814 · cs.LG · October 22, 2020 · 46 cites

TL;DR

This paper introduces mixreg, a simple data augmentation method for reinforcement learning that combines observations from different environments to improve generalization and outperforms existing methods on the Procgen benchmark.

Contribution

The paper proposes mixreg, a novel mixture regularization approach that enhances data diversity and smoothness in RL training, leading to better generalization across unseen environments.

Findings

01. Mixreg significantly outperforms baselines on the Procgen benchmark.

02. It effectively increases data diversity and policy smoothness.

03. It is applicable to both policy-based and value-based RL algorithms.

Abstract

Deep reinforcement learning (RL) agents trained in a limited set of environments tend to overfit and fail to generalize to unseen testing environments. To improve their generalizability, data augmentation approaches (e.g. cutout and random convolution) have previously been explored to increase data diversity. However, we find that these approaches only locally perturb the observations regardless of the training environments, showing limited effectiveness in enhancing data diversity and generalization performance. In this work, we introduce a simple approach, named mixreg, which trains agents on a mixture of observations from different training environments and imposes linearity constraints on the observation interpolations and the supervision (e.g. associated reward) interpolations. Mixreg increases the data diversity more effectively and helps learn smoother policies. We…
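
The mixing step described in the abstract can be illustrated with a minimal sketch. It assumes a mixup-style scheme: one coefficient drawn from a Beta(alpha, alpha) distribution, each observation paired with another from the same batch by a random permutation, and the same convex combination applied to observations and rewards. The function name `mix_batch`, the default `alpha=0.2`, and the array shapes are illustrative assumptions, not details confirmed by the text above.

```python
import numpy as np


def mix_batch(obs, rewards, alpha=0.2, rng=None):
    """Convexly combine each observation and reward with a randomly paired one.

    Hypothetical helper for illustration only; not the authors' code.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch_size = obs.shape[0]
    # Assumption: a single mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)
    # Pair sample i with sample perm[i] from the same batch.
    perm = rng.permutation(batch_size)
    # The same interpolation is applied to inputs and to the supervision signal,
    # which is the linearity constraint the abstract refers to.
    mixed_obs = lam * obs + (1.0 - lam) * obs[perm]
    mixed_rewards = lam * rewards + (1.0 - lam) * rewards[perm]
    return mixed_obs, mixed_rewards, lam, perm


# Usage with random stand-in data (8 image-like observations of shape 64x64x3).
rng = np.random.default_rng(0)
obs = rng.random((8, 64, 64, 3))
rewards = rng.random(8)
mixed_obs, mixed_rewards, lam, perm = mix_batch(obs, rewards, rng=rng)
```

In an actual training loop, the observations in a batch would come from different training environments, and the mixed pairs would replace the raw samples when computing the loss. How other supervision signals (for example returns or advantages) and discrete actions are handled is left out of this sketch.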

Code & Models

Videos

Improving Generalization in Reinforcement Learning with Mixture Regularization · slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms · Adaptive Dynamic Programming Control

Methods: Cutout