# Failure-aware Policy Learning for Self-assessable Robotics Tasks

**Authors:** Kechun Xu, Runjian Chen, Shuqi Zhao, Zizhang Li, Hongxiang Yu, Ci Chen, Yue Wang, Rong Xiong

arXiv: 2302.13024 · 2023-02-28

## TL;DR

This paper introduces a failure-aware policy learning approach for robotic tasks that leverages self-assessment results to improve action selection, leading to higher success rates with fewer trials compared to traditional elimination strategies.

## Contribution

It proposes two neural network architectures that incorporate self-assessment failure information into policy learning, capturing dependencies between failures and remaining actions.

## Key findings

- Higher task success rates with fewer trials.
- Outperforms the process-of-elimination strategy when actions are correlated.
- Effective in three different robotic tasks.

## Abstract

Self-assessment rules, which verify the feasibility of a selected action before actual execution, play an essential role in safe and effective real-world robotic applications. But how to use the self-assessment results to re-choose actions remains a challenge. Previous methods eliminate the selected action that the self-assessment rules evaluate as failed and re-choose the one with the next-highest affordance (i.e., the process-of-elimination strategy [1]), which ignores the dependency between the self-assessment results and the remaining untried actions. This dependency is important, since previous failures might help trim the remaining over-estimated actions. In this paper, we set out to investigate this dependency by learning a failure-aware policy. We propose two architectures for the failure-aware policy: one represents the self-assessment results of previous failures as a variable state, and the other leverages recurrent neural networks to implicitly memorize the previous failures. Experiments conducted on three tasks demonstrate that our method achieves better performance, with higher task success rates in fewer trials. Moreover, when the actions are correlated, learning a failure-aware policy achieves better performance than the process-of-elimination strategy.
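The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy affordance values, the feasibility set, and the hand-written `toy_score_fn` are all assumptions made for illustration. In the paper the failure-conditioned scores come from a learned neural network, whereas here a simple neighbourhood penalty stands in for that policy.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def process_of_elimination(
    affordances: Sequence[float],
    self_assess: Callable[[int], bool],
    max_trials: int,
) -> Tuple[Optional[int], int]:
    """Baseline: try actions in descending affordance order, dropping failures."""
    order = sorted(range(len(affordances)), key=lambda a: affordances[a], reverse=True)
    for trials, action in enumerate(order[:max_trials], start=1):
        if self_assess(action):
            return action, trials
    return None, min(max_trials, len(order))


def failure_aware_selection(
    score_fn: Callable[[Sequence[int]], Sequence[float]],
    self_assess: Callable[[int], bool],
    num_actions: int,
    max_trials: int,
) -> Tuple[Optional[int], int]:
    """Re-score every untried action after each failure, given the failure history."""
    failed: List[int] = []
    for trials in range(1, max_trials + 1):
        candidates = [a for a in range(num_actions) if a not in failed]
        if not candidates:
            return None, trials - 1
        scores = score_fn(failed)
        action = max(candidates, key=lambda a: scores[a])
        if self_assess(action):
            return action, trials
        failed.append(action)
    return None, max_trials


# Toy setup (hypothetical): five actions on a line; actions 0-2 are
# over-estimated and all infeasible, so their failures are correlated.
AFFORDANCES = [0.9, 0.85, 0.8, 0.3, 0.4]
FEASIBLE = {3, 4}


def self_assess(action: int) -> bool:
    """Stand-in for a self-assessment rule: True if the action is feasible."""
    return action in FEASIBLE


def toy_score_fn(failed: Sequence[int]) -> List[float]:
    """Hand-written stand-in for a learned policy: penalise neighbours of failures."""
    return [
        AFFORDANCES[a] - 0.5 * sum(1 for f in failed if abs(a - f) <= 2)
        for a in range(len(AFFORDANCES))
    ]


baseline = process_of_elimination(AFFORDANCES, self_assess, max_trials=5)
aware = failure_aware_selection(toy_score_fn, self_assess, num_actions=5, max_trials=5)
print(baseline)  # (4, 4): succeeds on the fourth trial
print(aware)     # (4, 2): succeeds on the second trial
```

In this toy case the first failure lowers the scores of the neighbouring over-estimated actions, so the failure-aware loop skips them, while the baseline has to try each of them in turn.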

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.13024/full.md

## Figures

25 figures with captions in the complete paper: https://tomesphere.com/paper/2302.13024/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/2302.13024/full.md

---
Source: https://tomesphere.com/paper/2302.13024