Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation

Samuel Ainsworth; Matt Barnes; Siddhartha Srinivasa

arXiv:1912.01649·cs.LG·December 5, 2019

TL;DR

This paper introduces emergency stop mechanisms (e-stops), which exploit the fact that many tasks need only a small subset of the full state space. Restricting the agent to that subset reduces the exploration required and speeds up reinforcement learning, at the cost of a small asymptotic sub-optimality gap.

Contribution

The paper proposes a simple e-stop technique that improves sample complexity and rate of convergence in RL by exploiting state space structure, supported by a regret analysis and by experiments in discrete and continuous settings.

Findings

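01 E-stops significantly improve sample complexity by reducing the amount of exploration required.

02 Experiments in discrete and continuous settings show order-of-magnitude speedups on top of existing reinforcement learning methods.

03 A performance bound is retained, trading off the rate of convergence against a small asymptotic sub-optimality gap.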

Abstract

In many environments, only a relatively small subset of the complete state space is necessary in order to accomplish a given task. We develop a simple technique using emergency stops (e-stops) to exploit this phenomenon. Using e-stops significantly improves sample complexity by reducing the amount of required exploration, while retaining a performance bound that efficiently trades off the rate of convergence with a small asymptotic sub-optimality gap. We analyze the regret behavior of e-stops and present empirical results in discrete and continuous settings demonstrating that our reset mechanism can provide order-of-magnitude speedups on top of existing reinforcement learning methods.
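
The reset mechanism described in the abstract can be pictured with a small sketch. It is an illustration under stated assumptions, not the code from the official repository: it assumes the relevant subset of states is collected from observed demonstration trajectories, and that leaving this subset ends the episode with a fixed reward. The names `support_from_demos`, `with_estop`, `chain_step`, and `estop_reward` are hypothetical.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

State = Hashable
# A step function maps (state, action) to (next_state, reward, done).
StepFn = Callable[[State, int], Tuple[State, float, bool]]


def support_from_demos(trajectories: Iterable[Iterable[State]]) -> Set[State]:
    """Collect every state seen in the demonstrations (assumed support set)."""
    return {state for trajectory in trajectories for state in trajectory}


def with_estop(step: StepFn, support: Set[State], estop_reward: float = 0.0) -> StepFn:
    """Wrap a step function so that leaving the support set ends the episode."""

    def wrapped(state: State, action: int) -> Tuple[State, float, bool]:
        next_state, reward, done = step(state, action)
        if next_state not in support:
            # E-stop triggered: terminate early with the assumed e-stop reward.
            return next_state, estop_reward, True
        return next_state, reward, done

    return wrapped


def chain_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy 1-D chain: actions are -1 or +1, and the goal is state 5."""
    next_state = state + action
    reached_goal = next_state == 5
    return next_state, (1.0 if reached_goal else 0.0), reached_goal


# A single demonstration walking from state 0 to the goal at state 5.
support = support_from_demos([[0, 1, 2, 3, 4, 5]])
safe_step = with_estop(chain_step, support)
```

Any existing learner can call `safe_step` in place of `chain_step`. Episodes that wander to negative states are cut short, so exploration is spent only on the states the demonstration covered.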

Code & Models

Repositories

samuela/e-stops (JAX, official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Reinforcement Learning in Robotics