Adaptive Rollout Length for Model-Based RL Using Model-Free Deep RL

Abhinav Bhatia; Philip S. Thomas; Shlomo Zilberstein

arXiv:2206.02380·cs.LG·June 8, 2022

TL;DR

This paper introduces a method that dynamically adapts the rollout length in model-based reinforcement learning. A model-free deep RL agent adjusts this hyperparameter during training, with the goal of improving the final policy that model-based learning produces.

Contribution

It frames tuning the rollout length as a meta-level sequential decision-making problem and solves that problem with model-free deep RL. The resulting approach outperforms heuristic methods for setting the rollout length.

Findings

01. Outperforms heuristic baselines on benchmark environments

02. Dynamically adapting rollout length improves policy performance

03. Method effectively balances prediction accuracy and efficiency

Abstract

Model-based reinforcement learning promises to learn an optimal policy from fewer interactions with the environment compared to model-free reinforcement learning by learning an intermediate model of the environment in order to predict future interactions. When predicting a sequence of interactions, the rollout length, which limits the prediction horizon, is a critical hyperparameter as accuracy of the predictions diminishes in the regions that are further away from real experience. As a result, with a longer rollout length, an overall worse policy is learned in the long run. Thus, the hyperparameter provides a trade-off between quality and efficiency. In this work, we frame the problem of tuning the rollout length as a meta-level sequential decision-making problem that optimizes the final policy learned by model-based reinforcement learning given a fixed budget of environment…
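
The trade-off described in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper: the one-dimensional dynamics, the slightly wrong learned model, the constant policy, and every function name are assumptions chosen only to show how prediction error grows as the rollout length increases.

```python
# Toy illustration (hypothetical, not the paper's method or environments):
# a learned model with a small one-step error drifts further from the true
# dynamics the longer it is rolled out.

def true_step(state, action):
    """Assumed ground-truth dynamics of a 1-D toy system."""
    return 0.9 * state + action

def learned_step(state, action):
    """Assumed learned model with a slightly wrong coefficient."""
    return 0.95 * state + action

def policy(state):
    """Assumed fixed policy that always applies the same action."""
    return 0.1

def rollout(step_fn, start_state, length):
    """Apply step_fn for `length` steps and return the final state."""
    state = start_state
    for _ in range(length):
        state = step_fn(state, policy(state))
    return state

def rollout_error(length, start_state=1.0):
    """Absolute gap between the model's prediction and the true final state."""
    real = rollout(true_step, start_state, length)
    predicted = rollout(learned_step, start_state, length)
    return abs(real - predicted)

# Compare short, medium, and long rollouts from the same real start state.
errors = [rollout_error(k) for k in (1, 5, 20)]
for k, err in zip((1, 5, 20), errors):
    print(f"rollout length {k:2d}: prediction error {err:.4f}")
```

In this toy setting the error grows steadily with the rollout length, which is the kind of degradation that makes the rollout length worth tuning rather than fixing in advance.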

Taxonomy

Topics: Reinforcement Learning in Robotics · Software Engineering Research · Machine Learning and Data Classification