The Least Restriction for Offline Reinforcement Learning

Zizhou Su

arXiv:2107.01757·cs.LG·July 6, 2021

Open Access

TL;DR

This paper introduces the Least Restriction framework for offline reinforcement learning, which minimizes constraints on action selection to improve stability and learning effectiveness from fixed datasets.

Contribution

The paper proposes a novel offline RL framework that reduces restrictions on action choices, addressing bootstrapping errors and enhancing learning from offline data.

Findings

01. LR can learn robustly from various offline datasets

02. LR outperforms previous methods on control tasks

03. LR avoids out-of-distribution actions effectively

Abstract

Many practical applications of reinforcement learning (RL) constrain the agent to learn from a fixed offline dataset of logged interactions that has already been gathered, with no possibility of further data collection. However, commonly used off-policy RL algorithms, such as the Deep Q Network and the Deep Deterministic Policy Gradient, are incapable of learning without data correlated to the distribution under the current policy, making them ineffective in this offline setting. As a first step towards useful offline RL algorithms, we analyze the cause of instability in standard off-policy RL algorithms: the bootstrapping error. The key to avoiding this error is ensuring that the agent's action space stays within the fixed offline dataset. Based on this analysis, a novel offline RL framework, the Least Restriction (LR), is proposed in this…
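
The bootstrapping error mentioned in the abstract can be illustrated with a small tabular sketch. This is not the paper's LR algorithm, whose details are not included in the excerpt above; it is a generic toy example under stated assumptions. The function `batch_q_learning`, the two-state dataset, and the optimistic initial value `init_q` are all hypothetical and chosen only to make the effect visible: when the bootstrap maximum ranges over actions that never appear in the dataset, their never-updated values leak into the targets.

```python
def batch_q_learning(dataset, n_actions, restrict, gamma=0.9, lr=0.5,
                     sweeps=200, init_q=5.0):
    """Tabular Q-learning over a fixed dataset of (s, a, r, s_next, done) tuples.

    restrict=True  : the bootstrap max only considers actions that appear
                     in the dataset for s_next.
    restrict=False : the bootstrap max considers every action, including
                     ones whose Q-values are never updated.
    """
    # Record which actions were actually logged in each state.
    seen = {}
    for s, a, _, _, _ in dataset:
        seen.setdefault(s, set()).add(a)

    q = {}

    def row(s):
        # Lazily create a row of (assumed) optimistic initial values.
        return q.setdefault(s, [init_q] * n_actions)

    for _ in range(sweeps):
        for s, a, r, s_next, done in dataset:
            if done:
                target = r
            else:
                if restrict:
                    candidates = seen.get(s_next, set())
                else:
                    candidates = range(n_actions)
                # Assumption: if no logged action exists for s_next,
                # bootstrap from 0.0.
                bootstrap = max((row(s_next)[b] for b in candidates),
                                default=0.0)
                target = r + gamma * bootstrap
            row(s)[a] += lr * (target - row(s)[a])
    return q


# Toy dataset: only action 0 is ever logged; action 1 is out of distribution.
dataset = [
    (0, 0, 1.0, 1, False),  # state 0, action 0, reward 1, moves to state 1
    (1, 0, 0.0, 1, True),   # state 1, action 0, reward 0, episode ends
]

restricted = batch_q_learning(dataset, n_actions=2, restrict=True)
unrestricted = batch_q_learning(dataset, n_actions=2, restrict=False)
print("restricted   Q(0, 0):", restricted[0][0])
print("unrestricted Q(0, 0):", unrestricted[0][0])
```

In this toy setup the logged return from state 0 is 1.0. The restricted variant converges towards that value, while the unrestricted variant bootstraps from the untouched initial value of the unseen action and settles on a much larger estimate.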

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques · Smart Grid Energy Management