Detection-averse optimal and receding-horizon control for Markov decision processes

Nan Li; Ilya Kolmanovsky; Anouck Girard

arXiv:1908.07691·eess.SY·August 22, 2019

Open Access

TL;DR

This paper studies how to control a Markov decision process so that the agent pursues its nominal objective while hiding its state from an adversary, and proposes both a value iteration method and a more scalable receding-horizon approximation.

Contribution

It introduces a novel problem formulation for detection-averse control in MDPs and develops both exact and approximate solution approaches, including a receding-horizon method for larger problems.

Findings

01 Value iteration effectively solves small problems.

02 Receding-horizon approach scales to larger problems.

03 Examples demonstrate practical applicability.

Abstract

In this paper, we consider a Markov decision process (MDP) in which the ego agent has a nominal objective to pursue while needing to hide its state from detection by an adversary. After formulating the problem, we first propose a value iteration (VI) approach to solve it. To overcome the "curse of dimensionality" and thus gain scalability to larger problems, we then propose a receding-horizon optimization (RHO) approach to obtain approximate solutions. We use examples to illustrate and compare the VI and RHO approaches, and to show the potential of our problem formulation for practical applications.
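To make the contrast between the two approaches concrete, the following is a minimal sketch of textbook value iteration and a receding-horizon lookahead for a generic finite MDP with discounted rewards. It is not the paper's algorithm: the abstract does not specify the detection-averse objective or the adversary model, so both are omitted here. The function names, the array layouts `P[a][s][t]` and `R[s][a]`, and the parameters `gamma` and `horizon` are illustrative assumptions.

```python
def q_value(P, R, V, s, a, gamma):
    """One-step lookahead value of taking action a in state s.

    Assumed layout (hypothetical, not from the paper):
    P[a][s][t] is the probability of moving from s to t under a,
    R[s][a] is the immediate reward.
    """
    return R[s][a] + gamma * sum(p * v for p, v in zip(P[a][s], V))


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Standard value iteration: repeat Bellman backups until the
    largest change in any state value drops below tol."""
    n_states, n_actions = len(R), len(R[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [
            max(q_value(P, R, V, s, a, gamma) for a in range(n_actions))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = [
        max(range(n_actions), key=lambda a, s=s: q_value(P, R, V, s, a, gamma))
        for s in range(n_states)
    ]
    return V, policy


def receding_horizon_action(P, R, s, horizon=3, gamma=0.9):
    """Receding-horizon choice: solve a short finite-horizon problem,
    return only the first action, and re-plan at the next step.

    For simplicity this sketch backs up values over all states; a
    scalable version would only expand states reachable from s.
    """
    n_states, n_actions = len(R), len(R[0])
    V = [0.0] * n_states  # terminal values at the end of the horizon
    for _ in range(horizon - 1):
        V = [
            max(q_value(P, R, V, x, a, gamma) for a in range(n_actions))
            for x in range(n_states)
        ]
    return max(range(n_actions), key=lambda a: q_value(P, R, V, s, a, gamma))
```

In a control loop, `receding_horizon_action` would be called again at every time step with the newly observed state, whereas `value_iteration` computes a policy for all states once, which is what becomes expensive as the state space grows.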

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Optimization and Search Problems