Value Gradient weighted Model-Based Reinforcement Learning

Claas Voelcker; Victor Liao; Animesh Garg; Amir-massoud Farahmand

arXiv:2204.01464 · cs.LG · June 22, 2023 · 5 cites

TL;DR

This paper introduces VaGraM, a novel value-gradient weighted model learning method that enhances model-based reinforcement learning by better accounting for value functions, leading to improved robustness and performance in complex environments.

Contribution

The paper proposes VaGraM, a new value-aware model learning approach that addresses limitations of existing methods, especially in small capacity models and high-dimensional states.

Findings

01 VaGraM achieves higher returns on MuJoCo benchmarks.

02 VaGraM is more robust than maximum likelihood approaches.

03 The analysis highlights the importance of accounting for exploration and function approximation.

Abstract

Model-based reinforcement learning (MBRL) is a sample-efficient technique to obtain control policies, yet unavoidable modeling errors often lead to performance deterioration. The model in MBRL is often fitted solely to reconstruct dynamics, state observations in particular, while the impact of model error on the policy is not captured by the training objective. This leads to a mismatch between the intended goal of MBRL, enabling good policy and value learning, and the target of the loss function employed in practice, future state prediction. Naive intuition would suggest that value-aware model learning would fix this problem and, indeed, several solutions to this objective mismatch problem have been proposed based on theoretical analysis. However, they tend to be inferior in practice to commonly used maximum likelihood (MLE) based approaches. In this paper we propose the Value-gradient…
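
The objective mismatch described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the value function `value`, its gradient `value_grad`, the two candidate predictions, and the loss `grad_weighted_loss` are all hypothetical, and the weighting shown (squared prediction error scaled per dimension by the value gradient at the true next state) is only one plausible reading of "value-gradient weighted". The point is that two models with identical squared error can induce very different value errors, which a plain reconstruction loss cannot distinguish.

```python
# Illustrative sketch only; all names and numbers are assumptions, not the paper's code.

def value(s):
    # Hypothetical value function: very sensitive to s[0], barely sensitive to s[1].
    return -(10.0 * s[0] ** 2 + 0.1 * s[1] ** 2)


def value_grad(s):
    # Analytic gradient of the hypothetical value function above.
    return [-20.0 * s[0], -0.2 * s[1]]


def mse_loss(pred, target):
    # Plain squared reconstruction error (the MLE objective under a
    # fixed-variance Gaussian model, up to constants).
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def grad_weighted_loss(pred, target):
    # Assumed weighting: scale each dimension's error by the value gradient
    # evaluated at the true next state, then square and sum.
    g = value_grad(target)
    return sum((gi * (p - t)) ** 2 for gi, p, t in zip(g, pred, target))


def value_error(pred, target):
    # How far the predicted next state shifts the value estimate.
    return abs(value(pred) - value(target))


true_next = [1.0, 1.0]
model_a = [1.1, 1.0]  # error on the value-relevant dimension
model_b = [1.0, 1.1]  # same-size error on the nearly irrelevant dimension

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(
        name,
        "mse:", mse_loss(pred, true_next),
        "weighted:", grad_weighted_loss(pred, true_next),
        "value error:", value_error(pred, true_next),
    )
```

Both candidates receive the same squared error, while the weighted loss penalizes `model_a` far more heavily, matching the larger value error it causes.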

Code & Models

Repositories

pairlab/vagram (PyTorch, official)

Videos

Value Gradient weighted Model-Based Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics