Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

Tong Mu; Georgios Theocharous; David Arbour; Emma Brunskill

arXiv:2112.15221·cs.AI·January 3, 2022

TL;DR

This paper introduces Constraint Sampling Reinforcement Learning (CSRL), a practical method that incorporates human expertise as candidate policy constraints to speed up learning in complex RL tasks, and reports faster learning than baselines across four environments.

Contribution

The paper presents CSRL, an algorithm that adaptively switches among multiple candidate constraints, using helpful ones to learn quickly while staying robust to misspecified ones. It can be applied on top of various base RL algorithms.

Findings

1. CSRL outperforms baselines in four diverse environments.
2. Incorporating constraints accelerates early learning performance.
3. The method is effective with different base RL algorithms.

Abstract

Online reinforcement learning (RL) algorithms are often difficult to deploy in complex human-facing applications as they may learn slowly and have poor early performance. To address this, we introduce a practical algorithm for incorporating human insight to speed learning. Our algorithm, Constraint Sampling Reinforcement Learning (CSRL), incorporates prior domain knowledge as constraints/restrictions on the RL policy. It takes in multiple potential policy constraints to maintain robustness to misspecification of individual constraints while leveraging helpful ones to learn quickly. Given a base RL learning algorithm (e.g., UCRL, DQN, Rainbow), we propose an upper confidence with elimination scheme that leverages the relationship between the constraints, and their observed performance, to adaptively switch among them. We instantiate our algorithm with DQN-type algorithms and UCRL as base…
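
To make the idea of an upper confidence with elimination scheme concrete, the following is a minimal sketch in Python of a generic bandit-style selector over candidate constraints. It is not the paper's exact rule: it ignores the relationship between constraints that CSRL is described as leveraging, it assumes episode returns are scaled to [0, 1], and the names `ConstraintSelector`, `run_demo`, `delta`, and the fixed per-constraint returns are all hypothetical stand-ins for running a base RL algorithm under each constraint.

```python
import math


class ConstraintSelector:
    """Hypothetical selector: pick the constraint with the highest upper
    confidence bound and drop constraints that look clearly worse."""

    def __init__(self, names, delta=0.05):
        self.active = list(names)                  # constraints still in play
        self.counts = {n: 0 for n in self.active}  # episodes run per constraint
        self.means = {n: 0.0 for n in self.active} # running mean episode return
        self.delta = delta                         # assumed confidence parameter

    def _bonus(self, name):
        # Hoeffding-style confidence width; infinite until first tried.
        n = self.counts[name]
        if n == 0:
            return math.inf
        return math.sqrt(math.log(1.0 / self.delta) / (2.0 * n))

    def select(self):
        # Optimistic choice among the constraints not yet eliminated.
        return max(self.active, key=lambda n: self.means[n] + self._bonus(n))

    def update(self, name, episode_return):
        # Incremental mean update, then re-check which constraints survive.
        self.counts[name] += 1
        self.means[name] += (episode_return - self.means[name]) / self.counts[name]
        self._eliminate()

    def _eliminate(self):
        # Drop any constraint whose upper bound falls below the best lower
        # bound. The constraint holding the best lower bound always survives.
        best_lcb = max(self.means[n] - self._bonus(n) for n in self.active)
        self.active = [
            n for n in self.active
            if self.means[n] + self._bonus(n) >= best_lcb
        ]


# Hypothetical fixed returns standing in for "run the base RL algorithm
# for one episode under this constraint and report the return".
HYPOTHETICAL_RETURNS = {"tight": 0.9, "loose": 0.5, "wrong": 0.1}


def run_demo(episodes=1000):
    selector = ConstraintSelector(HYPOTHETICAL_RETURNS)
    for _ in range(episodes):
        name = selector.select()
        selector.update(name, HYPOTHETICAL_RETURNS[name])
    return selector


if __name__ == "__main__":
    result = run_demo()
    print("still active:", result.active)
    print("episodes per constraint:", result.counts)
```

In this toy setting the poorly performing constraint is tried a few times and then eliminated, while most episodes go to the constraint with the highest observed return. A real instantiation would replace the fixed returns with returns observed from the base learner and would need a rule for how the learner's state is shared or reset across constraints, which this sketch does not address.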

Code & Models

Repositories

StanfordAI4HI/CSRL (PyTorch, official)

Videos

Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Dense Connections · Convolution · Q-Learning · Deep Q-Network · Balanced Selection