On the Practical Consistency of Meta-Reinforcement Learning Algorithms

Zheng Xiong; Luisa Zintgraf; Jacob Beck; Risto Vuorio; Shimon Whiteson

arXiv:2112.00478·cs.LG·December 2, 2021

Open Access

TL;DR

This paper empirically examines whether the theoretical property of consistency in meta-reinforcement learning (meta-RL) algorithms translates into practical benefits. It finds that theoretically consistent algorithms can usually adapt to out-of-distribution (OOD) tasks while inconsistent ones cannot, and that inconsistent algorithms can be made consistent by continuing to update all agent components on the OOD tasks.

Contribution

It provides empirical evidence, on a set of representative meta-RL algorithms, linking theoretical consistency to practical adaptation on OOD tasks, and shows that inconsistent algorithms can be made consistent by continuing to update all agent components on those tasks.

Findings

01

Theoretically consistent algorithms can usually adapt to OOD tasks, although they can still fail in practice for reasons such as poor exploration.

02

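Theoretically inconsistent algorithms can be made consistent by continuing to update all agent components on the OOD tasks, after which they adapt as well as or better than originally consistent ones.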

03

Theoretical consistency is a desirable property in practice: it goes together with the ability to adapt to OOD tasks.

Abstract

Consistency is the theoretical property of a meta-learning algorithm that ensures that, under certain assumptions, it can adapt to any task at test time. An open question is whether and how theoretical consistency translates into practice, in comparison to inconsistent algorithms. In this paper, we empirically investigate this question on a set of representative meta-RL algorithms. We find that theoretically consistent algorithms can indeed usually adapt to out-of-distribution (OOD) tasks, while inconsistent ones cannot, although consistent algorithms can still fail in practice for reasons like poor exploration. We further find that theoretically inconsistent algorithms can be made consistent by continuing to update all agent components on the OOD tasks, and adapt as well as or better than originally consistent ones. We conclude that theoretical consistency is indeed a desirable property, and inconsistent…
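To make the distinction in the abstract concrete, the following is a minimal toy sketch, not the paper's algorithms or experiments. It assumes a hypothetical 1-D task in which the agent picks an action and receives reward -(action - goal)^2, with meta-training goals confined to [-1, 1]. The function names, the clipping rule standing in for a frozen inference component, and the use of the exact reward gradient as the update signal are all illustrative assumptions.

```python
# Toy illustration only: all names and modeling choices here are assumptions,
# not taken from the paper.

TRAIN_LOW, TRAIN_HIGH = -1.0, 1.0  # assumed range of meta-training goals


def reward(action, goal):
    """Reward is highest (zero) when the action matches the task goal."""
    return -(action - goal) ** 2


def reward_grad(action, goal):
    """Derivative of the reward with respect to the action."""
    return -2.0 * (action - goal)


def frozen_inference_adapt(goal):
    """Stand-in for an inconsistent learner with frozen weights.

    It can only represent goals inside the meta-training range, so its
    estimate of an OOD goal is clipped to that range and never improves.
    """
    return min(max(goal, TRAIN_LOW), TRAIN_HIGH)


def gradient_adapt(goal, steps, init=0.0, lr=0.1):
    """Stand-in for a consistent learner: keeps taking gradient steps on the
    test task, so it approaches the goal even outside the training range."""
    action = init
    for _ in range(steps):
        action += lr * reward_grad(action, goal)
    return action


def frozen_then_finetune(goal, steps, lr=0.1):
    """Inconsistent learner made consistent in this toy setting: start from
    the frozen estimate, then continue updating on the OOD task."""
    return gradient_adapt(goal, steps, init=frozen_inference_adapt(goal), lr=lr)


if __name__ == "__main__":
    ood_goal = 3.0  # outside the assumed training range
    for name, action in [
        ("frozen inference", frozen_inference_adapt(ood_goal)),
        ("gradient adaptation", gradient_adapt(ood_goal, steps=50)),
        ("frozen + continued updates", frozen_then_finetune(ood_goal, steps=50)),
    ]:
        print(f"{name}: action={action:.4f}, reward={reward(action, ood_goal):.6f}")
```

In this sketch the frozen learner stays stuck at the edge of the training range on the OOD goal, while both the gradient-based learner and the fine-tuned frozen learner approach the goal. This mirrors the abstract's qualitative claim only under the stated toy assumptions; it says nothing about exploration, which the abstract names as a reason consistent algorithms can still fail.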

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification