A Minimalist Approach to Offline Reinforcement Learning

Scott Fujimoto; Shixiang Shane Gu

arXiv:2106.06860 · cs.LG · December 6, 2021 · 164 cites

TL;DR

This paper presents a minimalist offline RL method that adds a behavior cloning term and data normalization to a standard RL algorithm, achieving competitive performance with less complexity and computational cost.

Contribution

It introduces a simple, effective offline RL approach that makes only minimal modifications to an existing algorithm, avoiding complex additional components and extra hyperparameters.

Findings

01 Matches state-of-the-art offline RL performance

02 Halves runtime compared to complex methods

03 Simplifies implementation and tuning

Abstract

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data. Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms take the approach of constraining or regularizing the policy with the actions contained in the dataset. Built on pre-existing RL algorithms, modifications to make an RL algorithm work offline come at the cost of additional complexity. Offline RL algorithms introduce new hyperparameters and often leverage secondary components such as generative models, while adjusting the underlying RL algorithm. In this paper we aim to make a deep RL algorithm work while making minimal changes. We find that we can match the performance of state-of-the-art offline RL algorithms by simply adding a behavior cloning term to the policy update of an online RL algorithm and normalizing the data. The resulting algorithm is a…
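
The two changes named in the abstract (a behavior cloning term in the policy update, and data normalization) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the `bc_weight` trade-off parameter, the `eps` constant, the choice to normalize state features only, and the use of a mean-squared-error behavior cloning penalty are all assumptions made for illustration, since the text above does not specify them.

```python
import numpy as np


def normalize_states(states, eps=1e-3):
    """Standardize each state feature with dataset-wide statistics.

    `eps` is an assumed small constant that avoids division by zero.
    Returns the normalized states plus the mean and std used.
    """
    mean = states.mean(axis=0, keepdims=True)
    std = states.std(axis=0, keepdims=True)
    return (states - mean) / (std + eps), mean, std


def bc_regularized_policy_loss(q_values, policy_actions, dataset_actions,
                               bc_weight=1.0):
    """Policy loss with an added behavior cloning term (illustrative).

    q_values:        critic estimates Q(s, pi(s)) for a batch, shape (N,)
    policy_actions:  actions pi(s) chosen by the policy, shape (N, A)
    dataset_actions: actions a stored in the offline dataset, shape (N, A)
    bc_weight:       hypothetical trade-off between the two terms
    """
    rl_term = -np.mean(q_values)  # maximize value, so minimize -Q
    bc_term = np.mean((policy_actions - dataset_actions) ** 2)  # stay near data
    return rl_term + bc_weight * bc_term


# Toy batch: two states with two features each, one action dimension.
states = np.array([[0.0, 10.0], [2.0, 30.0]])
normalized_states, state_mean, state_std = normalize_states(states)

q_values = np.array([1.0, 3.0])
policy_actions = np.array([[0.5], [0.0]])
dataset_actions = np.array([[0.0], [0.0]])
loss = bc_regularized_policy_loss(q_values, policy_actions, dataset_actions)
```

In this sketch a larger `bc_weight` pulls the policy toward the dataset actions, while a smaller one lets the value term dominate. How the two terms are balanced in the paper is not stated in the truncated abstract above.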

Code & Models

Videos

A Minimalist Approach to Offline Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Smart Grid Energy Management