Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning

Linrui Zhang; Zichen Yan; Li Shen; Shoujie Li; Xueqian Wang; Dacheng Tao

arXiv:2212.06998·cs.LG·December 15, 2022

TL;DR

This paper introduces a dual-agent reinforcement learning framework for robotics that enhances safety and data efficiency by decoupling reward maximization from safety constraint learning, enabling risk-aware control in complex tasks.

Contribution

The paper proposes a dual-agent RL approach that pairs a reward-maximizing baseline agent with a safe agent, improving safety and sample efficiency in robotic control tasks.

Findings

01. Outperforms state-of-the-art safe RL algorithms on robot locomotion and manipulation tasks.

02. Learns feasible and risk-averse policies from unsafe pre-trained models.

03. Requires fewer interactions to achieve near-optimal safe policies.

Abstract

Learning a risk-aware policy is essential but rather challenging in unstructured robotic tasks. Safe reinforcement learning methods open up new possibilities to tackle this problem. However, the conservative policy updates make it intractable to achieve sufficient exploration and desirable performance in complex, sample-expensive environments. In this paper, we propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent. Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control. Concretely, the baseline agent is responsible for maximizing rewards under standard RL settings. Thus, it is compatible with off-the-shelf training techniques of unconstrained optimization, exploration and exploitation. On the other hand, the safe agent mimics the baseline agent for policy improvement and learns to fulfill…
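One way to picture a safe agent that stays close to a baseline while respecting a safety budget is a KL projection with a Lagrange multiplier. The sketch below is an illustrative assumption, not the paper's algorithm: it takes a fixed baseline distribution over a few discrete actions, assigns each action a hypothetical cost, and finds the distribution closest to the baseline (in KL divergence) whose expected cost does not exceed a hypothetical `cost_limit`. The names `tilt`, `safe_projection`, `lam_max`, and all numeric values are invented for illustration.

```python
import math


def expected_cost(policy, costs):
    """Expected per-step cost of a discrete policy."""
    return sum(p * c for p, c in zip(policy, costs))


def tilt(baseline, costs, lam):
    """Minimizer of KL(pi || baseline) + lam * E_pi[cost]:
    pi(a) is proportional to baseline(a) * exp(-lam * cost(a))."""
    # Shift by the cheapest supported cost so the exponent is never positive.
    c_min = min(c for p, c in zip(baseline, costs) if p > 0)
    weights = [
        p * math.exp(-lam * (c - c_min)) if p > 0 else 0.0
        for p, c in zip(baseline, costs)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def safe_projection(baseline, costs, cost_limit, lam_max=1e3, iters=100):
    """Bisection on the multiplier lam >= 0 until the tilted policy
    meets the cost limit. Assumes lam_max is large enough to be feasible."""
    if expected_cost(baseline, costs) <= cost_limit:
        return list(baseline), 0.0  # baseline is already safe
    c_min = min(c for p, c in zip(baseline, costs) if p > 0)
    if c_min >= cost_limit:
        raise ValueError("cost limit is not reachable from the baseline's support")
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # Expected cost is non-increasing in lam, so bisection applies.
        if expected_cost(tilt(baseline, costs, mid), costs) > cost_limit:
            lo = mid
        else:
            hi = mid
    return tilt(baseline, costs, hi), hi


# Hypothetical baseline that favors the riskiest of three actions.
baseline = [0.1, 0.2, 0.7]
costs = [0.0, 0.5, 1.0]
safe_policy, lam = safe_projection(baseline, costs, cost_limit=0.3)
print(safe_policy, lam, expected_cost(safe_policy, costs))
```

The multiplier plays the role of a price on risk: when the baseline already satisfies the budget it is returned unchanged, and otherwise probability mass moves toward cheaper actions only as far as the budget requires.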

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Autonomous Vehicle Technology and Safety