Detection-averse optimal and receding-horizon control for Markov decision processes
Nan Li, Ilya Kolmanovsky, Anouck Girard

TL;DR
This paper addresses the challenge of controlling Markov decision processes in a way that balances pursuing objectives with avoiding detection by adversaries, proposing scalable solution methods.
Contribution
It introduces a novel problem formulation for detection-averse control in MDPs and develops both exact and approximate solution approaches, including a receding-horizon method for larger problems.
Findings
Value iteration effectively solves small problems.
Receding-horizon approach scales to larger problems.
Examples demonstrate practical applicability.
Abstract
In this paper, we consider a Markov decision process (MDP) in which the ego agent has a nominal objective to pursue while needing to hide its state from detection by an adversary. After formulating the problem, we first propose a value iteration (VI) approach to solve it. To overcome the "curse of dimensionality" and thus gain scalability to larger problems, we then propose a receding-horizon optimization (RHO) approach to obtain approximate solutions. We use examples to illustrate and compare the VI and RHO approaches, and to show the potential of our problem formulation for practical applications.
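To make the contrast between the two solution styles concrete, here is a minimal sketch of standard discounted value iteration next to a finite-horizon lookahead that returns only the first action. It is not the paper's detection-averse formulation: the reward matrix `R`, the transition tensor `P`, the discount `gamma`, and the two-state toy problem are all illustrative assumptions, and the adversary's detection model is not represented.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Standard discounted value iteration (generic, not the paper's objective).

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a).
    R: array of shape (S, A), immediate reward for taking action a in state s.
    Returns the value function V (shape S) and a greedy policy (shape S).
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)


def receding_horizon_action(P, R, state, horizon=5, gamma=0.9):
    """Plan over a short horizon, then return only the first action.

    For brevity this sketch backs up over all states; a scalable version
    would expand only the states reachable from `state` within `horizon` steps.
    """
    V = np.zeros(R.shape[0])
    Q = R.astype(float)
    for _ in range(horizon):
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V = Q.max(axis=1)
    return int(Q[state].argmax())


# Hypothetical toy problem: action 0 stays put, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
# Reward 1 only for staying in state 1.
R = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
])

V, policy = value_iteration(P, R)
first_action = receding_horizon_action(P, R, state=0, horizon=3)
```

In closed-loop use, `receding_horizon_action` would be called again at every step from the newly observed state, which trades the optimality of the full value-iteration solution for a bounded amount of computation per step.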
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Optimization and Search Problems
