Data-Efficient Reinforcement Learning for Malaria Control

Lixin Zou, Long Xia, Linfang Hou, Xiangyu Zhao, and Dawei Yin

arXiv:2105.01620 · cs.LG · May 6, 2021

Open Access

TL;DR

This paper presents VB-MCTS, a data-efficient, model-based reinforcement learning method using Gaussian Processes and variance-bonus rewards, enabling effective malaria control policies with minimal data and trials.

Contribution

Introduction of VB-MCTS, a novel, sample-efficient reinforcement learning approach combining Gaussian Process models and variance-based exploration for complex, cost-sensitive tasks like malaria control.

Findings

1. VB-MCTS outperforms state-of-the-art methods on malaria control tasks.

2. The method demonstrates high data efficiency with few trials.

3. Experimental results show superior performance in a competitive RL environment.

Abstract

Sequential decision-making in cost-sensitive tasks is daunting, especially for problems that significantly affect people's daily lives, such as malaria control and treatment recommendation. The main challenge faced by policymakers is to learn a policy from scratch by interacting with a complex environment in only a few trials. This work introduces a practical, data-efficient policy learning method, named Variance-Bonus Monte Carlo Tree Search (VB-MCTS), which can cope with very little data and facilitates learning from scratch in only a few trials. Specifically, the solution is a model-based reinforcement learning method. To avoid model bias, we apply Gaussian Process (GP) regression to estimate the transitions explicitly. With the GP world model, we propose a variance-bonus reward to measure the uncertainty about the world. Adding the reward to the planning with MCTS…
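To make the idea of a GP world model with a variance bonus more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The RBF kernel, the noise level, the weight `beta`, the additive form `mean + beta * sqrt(var)`, and the toy data are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of a and b (assumed kernel)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    # alpha = (K + noise * I)^{-1} y, computed via the Cholesky factor
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_qt @ alpha
    v = np.linalg.solve(chol, k_qt.T)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative values from round-off

def variance_bonus_reward(mean, var, beta=1.0):
    """Hypothetical bonus: predicted reward plus beta times predictive std."""
    return mean + beta * np.sqrt(var)

# Toy data: 1-D state-action features and observed rewards (made up).
x_train = np.array([[0.0], [0.5], [1.0]])
y_train = np.array([0.1, 0.3, 0.2])
# One query at an observed input, one far from all observations.
x_query = np.array([[0.5], [5.0]])

mean, var = gp_posterior(x_train, y_train, x_query)
bonus_reward = variance_bonus_reward(mean, var, beta=1.0)
print("mean:", mean)
print("variance:", var)
print("reward with bonus:", bonus_reward)
```

In this sketch the query far from the training data has a much larger predictive variance, so it receives a larger bonus. A planner that maximizes the bonus-augmented reward is therefore drawn toward poorly explored inputs, which is the general intuition behind uncertainty-driven exploration.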

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics