Using Offline Data to Speed Up Reinforcement Learning in Procedurally Generated Environments

Alain Andres; Lukas Schäfer; Stefano V. Albrecht; Javier Del Ser

arXiv:2304.09825·cs.LG·December 10, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper investigates how imitation learning from offline data can improve the sample efficiency of reinforcement learning in procedurally generated environments, and reports significant improvements from as few as two offline trajectories.

Contribution

The study shows that offline imitation learning, used for pre-training or concurrently with online RL, significantly boosts sample efficiency and can achieve optimal policies with very few offline trajectories.

Findings

01. Offline IL improves sample efficiency in RL tasks.

02. Pre-training with just two trajectories can enable learning of optimal policies.

03. Concurrent IL during RL training consistently enhances convergence to optimal policies.

Abstract

One of the key challenges of Reinforcement Learning (RL) is the ability of agents to generalise their learned policy to unseen settings. Moreover, training RL agents requires large numbers of interactions with the environment. Motivated by the recent success of Offline RL and Imitation Learning (IL), we conduct a study to investigate whether agents can leverage offline data in the form of trajectories to improve the sample-efficiency in procedurally generated environments. We consider two settings of using IL from offline data for RL: (1) pre-training a policy before online RL training and (2) concurrently training a policy with online RL and IL from offline data. We analyse the impact of the quality (optimality of trajectories) and diversity (number of trajectories and covered level) of available offline trajectories on the effectiveness of both approaches. Across four well-known…
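
The two settings described in the abstract can be illustrated with a minimal sketch. It assumes the IL objective is behavioural cloning (negative log-likelihood of the offline actions under a softmax policy), which this excerpt does not confirm; the function names, the `il_coef` weight, and the placeholder `rl_loss` value are hypothetical and are not taken from the paper or its repository.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one list of action logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bc_loss(policy_logits, expert_actions):
    # Behavioural-cloning loss (an assumed IL objective): mean negative
    # log-likelihood of the offline actions under the current policy.
    nll = 0.0
    for logits, action in zip(policy_logits, expert_actions):
        nll -= math.log(softmax(logits)[action])
    return nll / len(expert_actions)

def concurrent_loss(rl_loss, policy_logits, expert_actions, il_coef=1.0):
    # Setting (2): add a weighted IL term to the online RL loss.
    # il_coef is a hypothetical weighting hyperparameter.
    return rl_loss + il_coef * bc_loss(policy_logits, expert_actions)

# Setting (1), pre-training, would minimise bc_loss alone before any online RL.
# Two offline states with two actions each; the policy is uniform on both.
logits = [[0.0, 0.0], [0.0, 0.0]]
actions = [0, 1]
pretrain_objective = bc_loss(logits, actions)
# rl_loss=1.0 stands in for whatever online RL objective is being used.
joint_objective = concurrent_loss(1.0, logits, actions, il_coef=0.5)
```

Under this sketch, pre-training and concurrent training differ only in when the IL term is optimised: on its own before online RL starts, or alongside the RL loss at every update.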

Code & Models

Repositories

uoe-agents/imitation-learning-pcg
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · Auction Theory and Applications