Optimal Epidemic Control as a Contextual Combinatorial Bandit with Budget
Baihan Lin, Djallel Bouneffouf

TL;DR
This paper models epidemic control policy optimization as a contextual combinatorial bandit problem, aiming to balance resource use and epidemic suppression through real-time intervention planning.
Contribution
It introduces a novel bandit-based framework for dynamic epidemic policy optimization considering multiple criteria and resource constraints.
Findings
The approach achieves Pareto optimal solutions in simulated epidemic scenarios.
It effectively balances intervention stringency and epidemic control.
The method adapts to real-time data for improved policy recommendations.
Abstract
In light of the COVID-19 pandemic, it is an open challenge and a critical practical problem to find an optimal way to dynamically prescribe the best policies that balance both the governmental resources and epidemic control in different countries and regions. To solve this multi-dimensional tradeoff of exploitation and exploration, we formulate this technical challenge as a contextual combinatorial bandit problem that jointly optimizes a multi-criteria reward function. Given the historical daily cases in a region and the past intervention plans in place, the agent should generate useful intervention plans that policy makers can implement in real time to minimize both the number of daily COVID-19 cases and the stringency of the recommended interventions. We prove this concept with simulations of multiple realistic policy making scenarios and demonstrate a clear advantage in providing a…
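To make the formulation in the abstract concrete, the following is a minimal sketch of a contextual combinatorial bandit with a budget and a scalarized two-criteria reward. It is not the authors' algorithm: the epsilon-greedy rule, the names `combined_reward` and `EpsilonGreedyCombinatorialBandit`, the equal weighting of the two criteria, the use of a discrete context label, and the budget defined as a cap on total stringency are all assumptions made for illustration.

```python
import itertools
import random


def combined_reward(daily_cases, stringency, max_cases, max_stringency, weight=0.5):
    """Scalarize two criteria into one reward in [0, 1] (higher is better).

    Assumption: a weighted sum of normalized case reduction and normalized
    intervention cost; the paper's exact reward may differ.
    """
    case_term = 1.0 - min(daily_cases / max_cases, 1.0)
    cost_term = 1.0 - stringency / max_stringency
    return weight * case_term + (1.0 - weight) * cost_term


class EpsilonGreedyCombinatorialBandit:
    """Hypothetical agent: picks one stringency level per intervention dimension."""

    def __init__(self, levels, budget, epsilon=0.1, seed=0):
        self.levels = levels      # number of levels per intervention dimension
        self.budget = budget      # assumed cap on the summed stringency of a plan
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.values = {}          # mean reward per (context, dimension, level)
        self.counts = {}          # pull count per (context, dimension, level)

    def feasible_plans(self):
        """Enumerate every combination of levels that respects the budget."""
        all_plans = itertools.product(*(range(n) for n in self.levels))
        return [plan for plan in all_plans if sum(plan) <= self.budget]

    def _score(self, context, plan):
        # Additive estimate: sum of per-dimension value estimates.
        return sum(self.values.get((context, d, lvl), 0.0)
                   for d, lvl in enumerate(plan))

    def select(self, context):
        """Explore with probability epsilon, otherwise exploit the best plan."""
        plans = self.feasible_plans()
        if self.rng.random() < self.epsilon:
            return self.rng.choice(plans)
        return max(plans, key=lambda plan: self._score(context, plan))

    def update(self, context, plan, reward):
        """Incrementally update the running mean of each chosen (dimension, level)."""
        for d, lvl in enumerate(plan):
            key = (context, d, lvl)
            n = self.counts.get(key, 0) + 1
            self.counts[key] = n
            old = self.values.get(key, 0.0)
            self.values[key] = old + (reward - old) / n
```

In use, each day the caller would map recent case counts to a context label, call `select(context)` to get an intervention plan, observe the next day's cases, and pass `combined_reward(...)` to `update`. Enumerating all feasible plans is only practical for a small number of intervention dimensions; a larger action space would need a per-dimension or greedy selection step instead.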
Taxonomy
Topics: Advanced Bandit Algorithms Research · COVID-19 epidemiological studies
