MBDP: A Model-based Approach to Achieve both Robustness and Sample Efficiency via Double Dropout Planning

Wanpeng Zhang; Xi Xiao; Yao Yao; Mingzhe Chen; Dijun Luo

arXiv:2108.01295 · cs.LG · May 3, 2024

TL;DR

This paper introduces MBDP, a model-based reinforcement learning method that balances robustness and sample efficiency through dual dropout mechanisms, supported by theoretical analysis and experimental validation.

Contribution

It proposes a novel double-dropout planning framework that dynamically balances robustness and efficiency in model-based RL.

Findings

01. MBDP improves robustness with minimal efficiency loss.

02. MBDP maintains high sample efficiency while enhancing robustness.

03. Theoretical analysis confirms the effectiveness of the dropout mechanisms.

Abstract

Model-based reinforcement learning is a widely accepted approach to reducing excessive sample demands. However, the predictions of dynamics models are often not accurate enough, and the resulting bias may lead to catastrophic decisions due to insufficient robustness. It is therefore highly desirable to investigate how to improve the robustness of model-based RL algorithms while maintaining high sample efficiency. In this paper, we propose Model-Based Double-dropout Planning (MBDP) to balance robustness and efficiency. MBDP consists of two dropout mechanisms: rollout-dropout aims to improve robustness at a small cost in sample efficiency, while model-dropout is designed to compensate for the lost efficiency at a slight expense of robustness. By combining them in a complementary way, MBDP provides a flexible control mechanism to meet different demands of…
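The two dropout mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not state which rollouts or models are discarded, so the selection rules here are assumptions: rollout-dropout is read as discarding the highest-return fraction of simulated rollouts (so the policy trains on more pessimistic ones), and model-dropout as discarding the ensemble members with the largest prediction error. The function names, the `drop_frac` parameter, and the example numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def keep_lowest(scores: Sequence[float], drop_frac: float) -> List[int]:
    """Return the indices that remain after dropping the highest-scoring fraction."""
    if not 0.0 <= drop_frac < 1.0:
        raise ValueError("drop_frac must be in [0, 1)")
    n = len(scores)
    n_drop = int(n * drop_frac)  # number of items to discard (rounded down)
    ascending = sorted(range(n), key=lambda i: scores[i])
    return sorted(ascending[: n - n_drop])


def model_dropout(model_errors: Sequence[float], drop_frac: float) -> List[int]:
    """ASSUMPTION: drop the ensemble members with the largest prediction error."""
    return keep_lowest(model_errors, drop_frac)


def rollout_dropout(rollout_returns: Sequence[float], drop_frac: float) -> List[int]:
    """ASSUMPTION: drop the rollouts with the highest simulated return."""
    return keep_lowest(rollout_returns, drop_frac)


# Hypothetical example: 4 dynamics models, 4 simulated rollouts.
kept_models = model_dropout([0.10, 0.50, 0.20, 0.90], drop_frac=0.25)
kept_rollouts = rollout_dropout([12.0, 3.0, 7.5, 20.0], drop_frac=0.5)
print(kept_models)    # [0, 1, 2]
print(kept_rollouts)  # [1, 2]
```

Under this reading, raising the rollout drop fraction makes training more conservative, while raising the model drop fraction removes the least accurate models from the ensemble; the two fractions play the role of the tuning knobs the abstract alludes to.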

Taxonomy

Topics: Reinforcement Learning in Robotics · Viral Infectious Diseases and Gene Expression in Insects · Evolutionary Algorithms and Applications

Methods: Dropout