A Hierarchical Reinforcement Learning Method for Persistent Time-Sensitive Tasks

Xiao Li; Calin Belta

arXiv:1606.06355 · cs.AI · June 22, 2016 · 2 cites

TL;DR

This paper introduces a hierarchical reinforcement learning approach using signal temporal logic and options to effectively learn policies for complex, persistent, and time-sensitive tasks, demonstrated through simulation.

Contribution

The paper combines signal temporal logic (STL) with the options framework to address persistent and time-sensitive tasks in reinforcement learning, an integration it presents as novel for this problem.

Findings

1. The method learns satisfactory policies with few training cases.
2. The approach effectively handles complex persistent tasks.
3. Simulation results validate the method's efficiency.

Abstract

Reinforcement learning has been applied to many interesting problems, such as the famous TD-Gammon and inverted helicopter flight. However, little effort has been put into developing methods to learn policies for complex persistent tasks and for tasks that are time-sensitive. In this paper, we take a step towards solving this problem by using signal temporal logic (STL) as the task specification and by taking advantage of the temporal abstraction that the options framework provides. We show via simulation that a relatively easy-to-implement algorithm that combines STL and options can learn a satisfactory policy with a small number of training cases.
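
To make the idea of an STL task specification more concrete, the sketch below computes the standard quantitative (robustness) semantics of the "eventually" and "always" operators over a sampled trace, plus a nested "always eventually" formula of the kind that describes a persistent, time-bounded task. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not say how the specification is turned into a learning signal, and the discrete-time sampling, the function names, and the example trace are all hypothetical.

```python
from typing import Callable, Sequence

# A margin maps a sample to a signed score: positive means the predicate holds.
Margin = Callable[[float], float]


def _window(signal: Sequence[float], start: int, stop: int) -> Sequence[float]:
    """Return signal[start..stop] inclusive, or fail if the trace is too short."""
    if start < 0 or stop < start:
        raise ValueError("invalid window bounds")
    window = signal[start : stop + 1]
    if len(window) != stop - start + 1:
        raise ValueError("trace too short for the requested window")
    return window


def rho_eventually(signal: Sequence[float], margin: Margin,
                   a: int, b: int, t: int = 0) -> float:
    """Robustness of F_[a,b](margin > 0) at step t: the best margin in the window."""
    return max(margin(x) for x in _window(signal, t + a, t + b))


def rho_always(signal: Sequence[float], margin: Margin,
               a: int, b: int, t: int = 0) -> float:
    """Robustness of G_[a,b](margin > 0) at step t: the worst margin in the window."""
    return min(margin(x) for x in _window(signal, t + a, t + b))


def rho_always_eventually(signal: Sequence[float], margin: Margin,
                          horizon: int, k: int) -> float:
    """Robustness of G_[0,horizon] F_[0,k](margin > 0) at step 0.

    Reads as: at every step up to `horizon`, the predicate must become
    true again within the next `k` steps (a persistent, time-bounded task).
    """
    return min(rho_eventually(signal, margin, 0, k, t) for t in range(horizon + 1))


# Hypothetical trace and predicate "x > 1.5" (assumptions for illustration only).
trace = [0.0, 1.0, 3.0, 2.0, 0.5]
above = lambda x: x - 1.5

print(rho_eventually(trace, above, 0, 4))         # 1.5
print(rho_always(trace, above, 0, 4))             # -1.5
print(rho_always_eventually(trace, above, 2, 2))  # 1.5
```

A positive value means the trace satisfies the formula with that much slack, and a negative value means it violates it. Such a score is one common candidate for a reward signal when a task is specified in STL.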

Taxonomy

Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics · Neural Networks and Reservoir Computing

Methods: Dense Connections · Accumulating Eligibility Trace · Feedforward Network · TD Lambda · TD-Gammon