Safe-To-Explore State Spaces: Ensuring Safe Exploration in Policy Search with Hierarchical Task Optimization
Jens Lundell, Robert Krug, Erik Schaffernicht, Todor Stoyanov, Ville Kyrki

TL;DR
This paper introduces a hierarchical task optimization framework for safe policy search in reinforcement learning, ensuring exploration occurs within safe state spaces by decomposing tasks into sub-tasks with safety constraints, validated through simulation and real robot experiments.
Contribution
It proposes a hierarchical approach that constrains exploration to safe sub-manifolds, improving safety and sample efficiency in robot skill learning.
Findings
Safe exploration is achieved by hierarchical task decomposition.
Sample efficiency is improved through sub-manifold learning.
Experiments in simulation and on a real robot demonstrate the approach's effectiveness.
Abstract
Policy search reinforcement learning allows robots to acquire skills by themselves. However, the learning procedure is inherently unsafe as the robot has no a-priori way to predict the consequences of the exploratory actions it takes. Therefore, exploration can lead to collisions with the potential to harm the robot and/or the environment. In this work we address the safety aspect by constraining the exploration to happen in safe-to-explore state spaces. These are formed by decomposing target skills (e.g., grasping) into higher ranked sub-tasks (e.g., collision avoidance, joint limit avoidance) and lower ranked movement tasks (e.g., reaching). Sub-tasks are defined as concurrent controllers (policies) in different operational spaces together with associated Jacobians representing their joint-space mapping. Safety is ensured by only learning policies corresponding to lower ranked…
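The ranking idea in the abstract can be illustrated with classical null-space projection, in which a lower ranked task may only move the joints in directions that leave a higher ranked task undisturbed. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the abstract is truncated and does not say which solver the paper uses, both tasks are simplified to scalar outputs with row Jacobians, and the function names (`dot`, `prioritized_joint_velocity`) and all numeric values are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def prioritized_joint_velocity(j_high, v_high, j_low, v_low, eps=1e-9):
    """Combine two scalar tasks with strict priority (illustrative sketch).

    j_high, j_low: row Jacobians (lists) mapping joint velocities to task velocities.
    v_high: desired velocity of the higher ranked task (e.g. a safety task).
    v_low:  desired velocity of the lower ranked task (e.g. a learned, exploratory one).
    Returns joint velocities that achieve v_high exactly whenever j_high is non-zero.
    """
    n = len(j_high)
    hh = dot(j_high, j_high)
    if hh < eps:
        # Higher ranked task cannot be influenced by any joint: nothing to protect.
        q_high = [0.0] * n
        j_low_proj = list(j_low)
    else:
        # Minimum-norm joint velocity that realizes the higher ranked task.
        q_high = [j * v_high / hh for j in j_high]
        # Remove from j_low the component along j_high (null-space projection).
        scale = dot(j_low, j_high) / hh
        j_low_proj = [jl - scale * jh for jl, jh in zip(j_low, j_high)]

    ll = dot(j_low_proj, j_low_proj)
    if ll < eps:
        # Lower ranked task has no freedom left; only the higher ranked task acts.
        return q_high

    # Track the lower ranked task as well as possible inside the null space.
    residual = v_low - dot(j_low, q_high)
    return [qh + jp * residual / ll for qh, jp in zip(q_high, j_low_proj)]


# Hypothetical 3-joint example: the higher ranked task asks for zero motion
# along its direction; the lower ranked task proposes an exploratory velocity.
j_safety = [1.0, 0.0, 0.0]
j_reach = [1.0, 1.0, 0.0]
q_dot = prioritized_joint_velocity(j_safety, 0.0, j_reach, 0.5)
```

In this simplified setting, whatever value the lower ranked task proposes for `v_low`, the resulting joint velocity still satisfies the higher ranked task, which is the sense in which exploration stays confined to directions the higher ranked tasks permit.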
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Adversarial Robustness in Machine Learning
