Controlling a Markov Decision Process with an Abrupt Change in the Transition Kernel
Nathan Dahlin, Subhonmesh Bose, Venugopal V. Veeravalli

TL;DR
This paper develops a control strategy for Markov decision processes that experience sudden changes in their transition dynamics, framing it as an optimal stopping problem and deriving a threshold-based detection policy.
Contribution
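It casts the control of an MDP with an abrupt transition-kernel change as regret minimization against a mode-observing controller, then reduces this optimal stopping problem, through a sequence of approximations, to a quickest change detection (QCD) task with Markovian data.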
Findings
The optimal change detection policy for the resulting QCD problem is of threshold type, with a threshold that depends on the current state.
Numerical experiments illustrate various properties of the control-switching policy.
The method outperforms traditional control strategies in abrupt change scenarios.
Abstract
We consider the control of a Markov decision process (MDP) that undergoes an abrupt change in its transition kernel (mode). We formulate the problem of minimizing regret under control-switching based on mode change detection, compared to a mode-observing controller, as an optimal stopping problem. Using a sequence of approximations, we reduce it to a quickest change detection (QCD) problem with Markovian data, for which we characterize a state-dependent threshold-type optimal change detection policy. Numerical experiments illustrate various properties of our control-switching policy.
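To make the idea of change detection with Markovian data more concrete, here is a minimal sketch of a CUSUM-style detector whose alarm threshold depends on the current state. It is not the policy derived in the paper: the kernels `P0` and `P1`, the `thresholds` vector, and both function names are hypothetical, both kernels are assumed known with strictly positive entries, and actions are left out, so the chain is uncontrolled.

```python
import numpy as np

def cusum_markov(states, P0, P1, thresholds):
    """Return the first index t at which a change is declared, or None.

    states     : observed state sequence (integers)
    P0, P1     : assumed pre- and post-change transition matrices
    thresholds : hypothetical per-state alarm thresholds
    """
    # Log-likelihood ratio of each transition under P1 versus P0.
    llr = np.log(P1) - np.log(P0)
    stat = 0.0
    for t in range(1, len(states)):
        prev, cur = states[t - 1], states[t]
        # CUSUM recursion: accumulate evidence, clipped at zero.
        stat = max(0.0, stat + llr[prev, cur])
        # State-dependent stopping rule.
        if stat >= thresholds[cur]:
            return t
    return None

def simulate(P0, P1, change_time, horizon, rng, start=0):
    """Sample a path whose kernel switches from P0 to P1 at change_time."""
    states = [start]
    for t in range(1, horizon):
        P = P0 if t < change_time else P1
        states.append(int(rng.choice(len(P), p=P[states[-1]])))
    return states

# Illustrative two-state kernels (assumed values, not from the paper).
P0 = np.array([[0.9, 0.1], [0.2, 0.8]])
P1 = np.array([[0.3, 0.7], [0.6, 0.4]])
thresholds = np.array([3.0, 4.0])

rng = np.random.default_rng(0)
path = simulate(P0, P1, change_time=50, horizon=200, rng=rng)
alarm = cusum_markov(path, P0, P1, thresholds)
print("alarm time:", alarm)
```

Raising a state's threshold makes alarms in that state rarer at the cost of longer detection delay, which is the trade-off a state-dependent rule can tune separately for each state.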
Taxonomy
Topics: Advanced Queuing Theory Analysis · Healthcare Operations and Scheduling Optimization · Simulation Techniques and Applications
