Imitating careful experts to avoid catastrophic events
Jack R. P. Hanslope, Laurence Aitchison

TL;DR
This paper introduces a method to improve safe reinforcement learning for robots by incorporating human carefulness signals into inverse reinforcement learning, helping distinguish between undesirable and catastrophic outcomes.
Contribution
It proposes a novel approach that uses human carefulness signals in IRL to better identify and avoid catastrophic outcomes in robotic control.
Findings
Carefulness signals help disambiguate catastrophic from undesirable outcomes.
Incorporating carefulness improves safety in IRL-based robotic control.
The method enhances the ability to prevent injuries in human-robot interactions.
Abstract
RL is increasingly being used to control robotic systems that interact closely with humans. This interaction raises the problem of safe RL: how to ensure that an RL-controlled robotic system never, for instance, injures a human. This problem is especially challenging in rich, realistic settings where it is not even possible to clearly write down a reward function which incorporates these outcomes. In these circumstances, perhaps the only viable approach is based on IRL, which infers rewards from human demonstrations. However, IRL is massively underdetermined, as many different rewards can lead to the same optimal policies; we show that this makes it difficult to distinguish catastrophic outcomes (such as injuring a human) from merely undesirable outcomes. Our key insight is that humans do display different behaviour when catastrophic outcomes are possible: they become much more careful.…
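The underdetermination the abstract describes can be seen in a very small example. The sketch below is a hypothetical toy problem, not the authors' environment or method: the four-state layout, the reward values, and the helper name optimal_policy are all assumptions made for illustration. It runs plain value iteration under two reward functions that differ only in how bad the hazard state is, and checks that both yield the same optimal policy.

```python
# Hypothetical toy MDP for illustration only (not taken from the paper).
# States: 0 = start, 1 = safe detour, 2 = hazard, 3 = goal (terminal).
# At state 0: action 0 takes the detour, action 1 cuts through the hazard.
GAMMA = 0.9

# TRANSITIONS[state][action] = next state (deterministic dynamics).
TRANSITIONS = {
    0: {0: 1, 1: 2},
    1: {0: 3},
    2: {0: 3},
}


def optimal_policy(state_reward, gamma=GAMMA, sweeps=100):
    """Value iteration; the reward is received on entering a state."""
    values = {s: 0.0 for s in state_reward}  # terminal state 3 stays at 0
    for _ in range(sweeps):
        for s, actions in TRANSITIONS.items():
            values[s] = max(
                state_reward[nxt] + gamma * values[nxt]
                for nxt in actions.values()
            )
    # Greedy policy with respect to the converged values.
    policy = {}
    for s, actions in TRANSITIONS.items():
        policy[s] = max(
            actions,
            key=lambda a: state_reward[actions[a]] + gamma * values[actions[a]],
        )
    return policy


# Same task, two assumed severities for entering the hazard state.
undesirable = {0: 0.0, 1: -1.0, 2: -5.0, 3: 10.0}
catastrophic = {0: 0.0, 1: -1.0, 2: -1000.0, 3: 10.0}

policy_undesirable = optimal_policy(undesirable)
policy_catastrophic = optimal_policy(catastrophic)

# An optimal expert avoids the hazard in both cases, so the demonstrations match.
print(policy_undesirable == policy_catastrophic)  # prints True
```

Because the expert takes the detour under both reward functions, demonstrations from an optimal policy alone give an IRL method no way to tell a penalty of -5 from a penalty of -1000 in this toy setting. This is the ambiguity that, according to the abstract, carefulness signals are meant to resolve.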
Taxonomy
Topics: Reinforcement Learning in Robotics
