Safety-guaranteed Reinforcement Learning based on Multi-class Support Vector Machine

Kwangyeon Kim; Akshita Gupta; Hong-Cheol Choi; Inseok Hwang

arXiv:2006.07446 · cs.LG · June 16, 2020

TL;DR

This paper introduces a model-free reinforcement learning algorithm that represents the policy with a multi-class SVM and guarantees satisfaction of hard state constraints while converging to the optimal policy, for deterministic systems with discrete state and action spaces.

Contribution

It presents a novel SVM-based policy optimization method that guarantees constraint satisfaction and convergence to the optimal policy in a model-free RL setting.

Findings

01. Guarantees satisfaction of hard state constraints.

02. Ensures convergence to the optimal policy.

03. Demonstrated effectiveness on multiple examples.

Abstract

Several works have addressed the problem of incorporating constraints in the reinforcement learning (RL) framework; however, the majority of them can only guarantee the satisfaction of soft constraints. In this work, we address the problem of satisfying hard state constraints in a model-free RL setting with deterministic system dynamics. The proposed algorithm is developed for discrete state and action spaces and utilizes a multi-class support vector machine (SVM) to represent the policy. The state constraints are incorporated in the SVM optimization framework to derive an analytical solution for determining the policy parameters. The final policy converges to a solution that is guaranteed to satisfy the constraints. Additionally, the proposed formulation adheres to the Q-learning framework and thus also guarantees convergence to the optimal solution. The algorithm is demonstrated…
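For orientation, the Q-learning setting the abstract refers to can be sketched in a few lines. The following is a minimal illustration only, not the authors' algorithm: it runs plain tabular Q-learning on a hypothetical five-state deterministic chain and reads the policy off with an argmax over per-action scores, which has the same form as a multi-class classifier's decision rule. The environment, the names `step`, `q_learning`, and `policy`, and all constants are assumptions made for this example. The paper's SVM optimization and its handling of hard state constraints are not reproduced here, since the abstract does not give their details.

```python
import random

# Hypothetical deterministic chain (an assumption for illustration):
# states 0..4, actions 0 (move left) and 1 (move right), goal at state 4.
N_STATES, N_ACTIONS = 5, 2
GOAL = 4
GAMMA = 0.9  # discount factor


def step(state, action):
    """Deterministic dynamics: move one cell left or right, clipped to the chain."""
    delta = 1 if action == 1 else -1
    nxt = max(0, min(N_STATES - 1, state + delta))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def q_learning(episodes=200, horizon=20, seed=0):
    """Tabular Q-learning with uniformly random exploration.

    Because the dynamics are deterministic, a learning rate of 1 is used:
    each update overwrites Q(s, a) with the one-step bootstrapped target.
    """
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        s = rng.randrange(GOAL)  # start in any non-goal state
        for _ in range(horizon):
            a = rng.randrange(N_ACTIONS)
            s2, r = step(s, a)
            # The goal is terminal, so no bootstrapping from it.
            q[s][a] = r if s2 == GOAL else r + GAMMA * max(q[s2])
            s = s2
            if s == GOAL:
                break
    return q


def policy(q, state):
    """Multi-class decision rule: choose the action (class) with the highest score."""
    return max(range(N_ACTIONS), key=lambda a: q[state][a])


if __name__ == "__main__":
    q = q_learning()
    print([policy(q, s) for s in range(GOAL)])
```

In this toy setting the per-action scores are just the Q-table rows. A multi-class SVM policy would instead compute those scores from learned parameters, and, per the abstract, the paper determines those parameters analytically with the state constraints built into the optimization.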

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Elevator Systems and Control

Methods: Support Vector Machine · Q-Learning