Controlling a Markov Decision Process with an Abrupt Change in the Transition Kernel

Nathan Dahlin; Subhonmesh Bose; Venugopal V. Veeravalli

arXiv:2210.04098 · eess.SY · October 11, 2022

Open Access

TL;DR

This paper develops a control strategy for Markov decision processes that experience sudden changes in their transition dynamics, framing it as an optimal stopping problem and deriving a threshold-based detection policy.

Contribution

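It addresses control of MDPs with abrupt transition changes by formulating regret minimization under control-switching as an optimal stopping problem and reducing it, through a sequence of approximations, to a quickest change detection (QCD) problem with Markovian data.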

Findings

01

The optimal change detection policy for the resulting QCD problem is of threshold type, with a threshold that depends on the state.

02

Numerical experiments illustrate various properties of the control-switching policy.

03

Performance is measured as regret relative to a controller that observes the mode directly.

Abstract

We consider the control of a Markov decision process (MDP) that undergoes an abrupt change in its transition kernel (mode). We formulate the problem of minimizing regret under control-switching based on mode change detection, compared to a mode-observing controller, as an optimal stopping problem. Using a sequence of approximations, we reduce it to a quickest change detection (QCD) problem with Markovian data, for which we characterize a state-dependent threshold-type optimal change detection policy. Numerical experiments illustrate various properties of our control-switching policy.
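To make the idea of a state-dependent threshold concrete, here is a minimal sketch of a CUSUM-style change detector for Markovian data. It is not the paper's policy: the CUSUM recursion is a standard QCD statistic swapped in for illustration, and the function name `detect_change`, the kernels `P0` and `P1`, and the threshold values are all hypothetical. It assumes a fixed control policy, so each mode induces a known Markov chain with strictly positive transition probabilities.

```python
import math

# Hypothetical pre-change and post-change transition kernels on two states.
# Assumption: every entry is strictly positive, so log-likelihood ratios are finite.
P0 = [[0.9, 0.1],
      [0.1, 0.9]]
P1 = [[0.1, 0.9],
      [0.9, 0.1]]


def detect_change(states, P0, P1, thresholds):
    """Return the first time index at which a change is declared, or None.

    states     : observed state sequence s_0, s_1, ...
    P0, P1     : pre-change and post-change transition matrices
    thresholds : one threshold per state (hypothetical values, not derived
                 from the paper)
    """
    w = 0.0  # CUSUM statistic, reset to zero whenever it would go negative
    for t in range(1, len(states)):
        prev, cur = states[t - 1], states[t]
        # Log-likelihood ratio of the observed transition under the two kernels.
        llr = math.log(P1[prev][cur] / P0[prev][cur])
        w = max(0.0, w + llr)
        # Compare against the threshold attached to the current state.
        if w >= thresholds[cur]:
            return t
    return None


# The chain sits in state 0, then starts alternating, which is far more
# likely under P1 than under P0.
path = [0, 0, 0, 0, 1, 0, 1, 0]
stop_time = detect_change(path, P0, P1, thresholds=[3.0, 3.0])
```

Lowering the threshold for a particular state makes the detector stop sooner when the chain is in that state, which is the qualitative behavior a state-dependent threshold policy allows; how the thresholds should actually be chosen is the subject of the paper and is not reproduced here.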

Taxonomy

Topics: Advanced Queuing Theory Analysis · Healthcare Operations and Scheduling Optimization · Simulation Techniques and Applications