Soft Action Priors: Towards Robust Policy Transfer

Matheus Centa; Philippe Preux

arXiv:2209.09882 · cs.LG · September 21, 2022

TL;DR

This paper introduces adaptive algorithms that exploit soft action priors, including suboptimal ones, to improve policy transfer in reinforcement learning. It reports state-of-the-art results in tabular experiments and improved stability and robustness in continuous-action deep RL.

Contribution

The paper develops adaptive methods that exploit suboptimal action priors in RL, aiming for greater robustness and better performance than existing policy distillation techniques.

Findings

01. Achieved state-of-the-art performance in tabular experiments.

02. Improved stability and robustness in continuous action deep RL.

03. Effectively leverages suboptimal priors for better policy transfer.

Abstract

Despite success in many challenging problems, reinforcement learning (RL) is still confronted with sample inefficiency, which can be mitigated by introducing prior knowledge to agents. However, many transfer techniques in reinforcement learning make the limiting assumption that the teacher is an expert. In this paper, we use the action prior from the Reinforcement Learning as Inference framework - that is, a distribution over actions at each state which resembles a teacher policy, rather than a Bayesian prior - to recover state-of-the-art policy distillation techniques. Then, we propose a class of adaptive methods that can robustly exploit action priors by combining reward shaping and auxiliary regularization losses. In contrast to prior work, we develop algorithms for leveraging suboptimal action priors that may nevertheless impart valuable knowledge - which we call soft action priors.…
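The abstract names two ingredients, reward shaping and an auxiliary regularization loss, both built on an action prior. The sketch below illustrates one common way such terms can be written for a single tabular state; it is not taken from the paper. The function names, the fixed weight `alpha`, and the smoothing constant `eps` are all assumptions made for illustration, and the paper's adaptive weighting scheme is not reproduced here.

```python
import math

# Hypothetical helpers, not the authors' implementation.
# `prior_probs` is the action prior at one state: a list of probabilities,
# one per discrete action, resembling a (possibly suboptimal) teacher policy.

def shaped_reward(reward, prior_probs, action, alpha=0.1, eps=1e-8):
    """Reward shaping: add a log-prior bonus for the action taken.

    `alpha` is an assumed fixed weight; `eps` avoids log(0).
    """
    return reward + alpha * math.log(prior_probs[action] + eps)

def kl_to_prior(policy_probs, prior_probs, eps=1e-8):
    """Auxiliary regularizer: KL(policy || prior) at one state."""
    return sum(
        p * (math.log(p + eps) - math.log(q + eps))
        for p, q in zip(policy_probs, prior_probs)
    )

if __name__ == "__main__":
    prior = [0.7, 0.2, 0.1]    # teacher-like prior favoring action 0
    policy = [0.4, 0.4, 0.2]   # current student policy at the same state
    print(shaped_reward(1.0, prior, action=0))  # small penalty, prior likes action 0
    print(shaped_reward(1.0, prior, action=2))  # larger penalty, prior dislikes action 2
    print(kl_to_prior(policy, prior))           # positive unless policy == prior
```

With a fixed `alpha`, a poor prior would bias the learner everywhere it is wrong; the abstract's emphasis on adaptive methods suggests the weight on the prior should vary, which this sketch deliberately leaves out.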

Taxonomy

Topics: Reinforcement Learning in Robotics · Fuel Cells and Related Materials · Adversarial Robustness in Machine Learning