Learning Synthetic Environments and Reward Networks for Reinforcement Learning

Fabio Ferreira; Thomas Nierhoff; Andreas Saelinger; Frank Hutter

arXiv:2202.02790 · cs.LG · February 8, 2022

TL;DR

This paper introduces Synthetic Environments and Reward Networks, neural-network proxy models for training reinforcement learning agents. Once learned, the proxies let new agents train without further real-environment interactions and transfer robustly to unseen agents.

Contribution

It proposes a novel bi-level optimization framework to evolve proxy environments and reward models, improving RL training efficiency and transferability.

Findings

01

Once an SE is learned, new agents can be trained on it without further real-environment interactions

02

Agents trained on SEs perform comparably to those trained on real environments

03

SEs are robust to hyperparameter changes and transfer to unseen agents

Abstract

We introduce Synthetic Environments (SEs) and Reward Networks (RNs), represented by neural networks, as proxy environment models for training Reinforcement Learning (RL) agents. We show that an agent, after being trained exclusively on the SE, is able to solve the corresponding real environment. While an SE acts as a full proxy to a real environment by learning about its state dynamics and rewards, an RN is a partial proxy that learns to augment or replace rewards. We use bi-level optimization to evolve SEs and RNs: the inner loop trains the RL agent, and the outer loop trains the parameters of the SE / RN via an evolution strategy. We evaluate our proposed new concept on a broad range of RL algorithms and classic control environments. In a one-to-one comparison, learning an SE proxy requires more interactions with the real environment than training agents only on the real environment.…
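
The bi-level scheme described in the abstract can be illustrated with a deliberately tiny example. The sketch below is not the authors' implementation: it assumes a one-parameter "agent" `theta`, a one-parameter "synthetic environment" `psi`, a quadratic toy reward, and a basic score-weighted evolution strategy. All names (`TARGET`, `real_return`, `train_agent_on_se`, `evolve_se`) are hypothetical and chosen only for illustration.

```python
import random

TARGET = 2.0  # hypothetical optimum of the toy "real" task (an assumption)


def real_return(theta):
    """Return of an agent with parameter theta on the toy real environment."""
    return -(theta - TARGET) ** 2


def train_agent_on_se(psi, steps=50, lr=0.1):
    """Inner loop: train a fresh agent using only the synthetic reward.

    The synthetic reward is -(theta - psi)^2, whose gradient with respect
    to theta is -2 * (theta - psi). The real environment is never queried.
    """
    theta = 0.0
    for _ in range(steps):
        theta += lr * (-2.0 * (theta - psi))  # gradient ascent step
    return theta


def evolve_se(generations=200, population=10, sigma=0.1, lr=0.05, seed=0):
    """Outer loop: a simple evolution strategy over the SE parameter psi."""
    rng = random.Random(seed)
    psi = 0.0
    for _ in range(generations):
        noises = [rng.gauss(0.0, 1.0) for _ in range(population)]
        scores = []
        for eps in noises:
            # Train an agent on the perturbed SE, then score it on the real task.
            theta = train_agent_on_se(psi + sigma * eps)
            scores.append(real_return(theta))
        baseline = sum(scores) / population
        # Score-weighted average of the perturbations (search-gradient estimate).
        grad = sum((s - baseline) * e for s, e in zip(scores, noises))
        grad /= population * sigma
        psi += lr * grad
    return psi


if __name__ == "__main__":
    psi = evolve_se()
    agent = train_agent_on_se(psi)
    print("learned SE parameter:", psi)
    print("real return of an agent trained only on the SE:", real_return(agent))
```

In this toy setting the real environment is used only to score agents in the outer loop; once `psi` has been evolved, `train_agent_on_se` trains new agents without touching it, which mirrors the trade-off the abstract describes.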

Code & Models

Repositories

automl/learning_environments (PyTorch, official)

Videos

Learning Synthetic Environments and Reward Networks for Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural Networks and Applications · Data Stream Mining Techniques