Value-of-Information based Arbitration between Model-based and Model-free Control

Krishn Bera; Yash Mandilwar; Bapi Raju

arXiv:1912.05453 · cs.LG · December 12, 2019 · 1 citation

Open Access

TL;DR

This paper introduces a novel arbitration framework based on the value of information to effectively combine model-based and model-free control in reinforcement learning, improving learning efficiency and performance.

Contribution

It proposes a quantitative, uncertainty-based arbitration method that integrates model-based and model-free RL, advancing the understanding of skill learning mechanisms.

Findings

01 Outperforms standard Q-learning in experiments

02 Better data and computational efficiency than existing methods

03 Effective integration of model-based and model-free control

Abstract

There have been numerous attempts to explain general learning behaviours using model-based and model-free methods. While model-based control is flexible yet computationally expensive in planning, model-free control is quick but inflexible. Model-based control is therefore immune to reward devaluation and contingency degradation. Multiple arbitration schemes have been suggested to achieve the data efficiency and computational efficiency of model-based and model-free control, respectively. In this context, we propose a quantitative 'value of information' based arbitration between the two controllers in order to establish a general computational framework for skill learning. The interacting model-based and model-free reinforcement learning processes are arbitrated using an uncertainty-based value of information. We further show that our algorithm performs better than…
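The abstract does not spell out how the value of information is computed, so the following is only a minimal sketch of what an uncertainty-based arbitration rule could look like, not the authors' algorithm. It assumes the model-free controller keeps a mean and a variance for each action value, and it uses a hypothetical proxy for the value of information: how far the combined uncertainty of the two leading actions exceeds the gap between their means. The names `value_of_information`, `choose_controller`, and `planning_cost` are illustrative assumptions.

```python
import math


def value_of_information(means, variances):
    """Hypothetical VoI proxy (an assumption, not the paper's formula).

    Returns how much the combined standard deviation of the two
    highest-valued actions exceeds the gap between their means.
    A large value suggests the greedy choice could plausibly be wrong,
    so extra deliberation (planning) might change the decision.
    """
    if len(means) < 2 or len(means) != len(variances):
        raise ValueError("need matching estimates for at least two actions")
    # Rank actions by their current model-free mean value.
    order = sorted(range(len(means)), key=lambda a: means[a], reverse=True)
    best, second = order[0], order[1]
    gap = means[best] - means[second]
    spread = math.sqrt(variances[best] + variances[second])
    return max(0.0, spread - gap)


def choose_controller(means, variances, planning_cost):
    """Pick the model-based planner only when the VoI proxy exceeds its cost."""
    voi = value_of_information(means, variances)
    return "model-based" if voi > planning_cost else "model-free"


# Uncertain, closely matched actions: planning looks worthwhile.
uncertain = choose_controller([1.0, 0.9], [1.0, 1.0], planning_cost=0.5)

# Well-separated, confident estimates: the cheap habitual choice suffices.
confident = choose_controller([1.0, 0.0], [0.01, 0.01], planning_cost=0.5)
```

Under these assumptions, the rule spends planning effort early in learning, when value estimates are noisy, and falls back to the quick model-free controller once the estimates have separated.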

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Receptor Mechanisms and Signaling

Methods: Q-Learning