Offline RL With Resource Constrained Online Deployment

Jayanth Reddy Regatti; Aniket Anand Deshmukh; Frank Cheng; Young Hun Jung; Abhishek Gupta; Urun Dogan

arXiv:2110.03165 · cs.LG · December 9, 2021 · 1 citation

TL;DR

This paper introduces a new resource-constrained offline reinforcement learning setting where policies trained on fully processed offline data are transferred to online environments with limited features, addressing a novel challenge in RL deployment.

Contribution

The paper formalizes the resource-constrained offline RL problem, proposes a transfer algorithm from a fully-featured offline dataset to resource-limited online policies, and introduces the RC-D4RL benchmark for evaluation.
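The setting itself can be illustrated with a small sketch: the offline dataset stores every feature, while the deployed policy only sees a subset of them. This is not the paper's transfer algorithm, only a picture of the data mismatch it addresses. The helper `restrict_features`, the index list `online_idx`, and the toy observations are all hypothetical names and values chosen for illustration.

```python
from typing import List, Sequence


def restrict_features(observation: Sequence[float],
                      online_idx: Sequence[int]) -> List[float]:
    """Return only the features assumed to be observable online.

    `online_idx` is a hypothetical list of feature positions that the
    resource-constrained agent can still compute at deployment time.
    """
    return [observation[i] for i in online_idx]


# Toy offline observations with all four features present (illustrative values).
offline_batch = [
    [0.1, 0.2, 0.3, 0.4],
    [1.0, 2.0, 3.0, 4.0],
]

# Assumption: only features 0 and 2 are cheap enough to compute online.
online_idx = [0, 2]

# What the deployed policy would receive for the same underlying states.
online_batch = [restrict_features(obs, online_idx) for obs in offline_batch]
```

A policy trained only on `online_batch` ignores the extra offline features entirely; the question the paper poses is whether the full `offline_batch` can be used during training to do better than that.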

Findings

01 The transfer algorithm improves performance over baseline methods.

02 Policies trained with the proposed method perform better in resource-constrained online settings.

03 The RC-D4RL benchmark effectively captures the resource constraint challenge.

Abstract

Offline reinforcement learning is used to train policies in scenarios where real-time access to the environment is expensive or impossible. As a natural consequence of these harsh conditions, an agent may lack the resources to fully observe the online environment before taking an action. We dub this situation the resource-constrained setting. This leads to situations where the offline dataset (available for training) can contain fully processed features (using powerful language models, image models, complex sensors, etc.) which are not available when actions are actually taken online. This disconnect leads to an interesting and unexplored problem in offline RL: Is it possible to use a richly processed offline dataset to train a policy which has access to fewer features in the online environment? In this work, we introduce and formalize this novel resource-constrained problem setting. We…

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning