Simulator-Driven Deceptive Control via Path Integral Approach
Apurva Patil, Mustafa O. Karabag, Takashi Tanaka, Ufuk Topcu

TL;DR
This paper introduces a simulation-based method for synthesizing optimal deceptive control policies in stochastic systems. Using path integral control and Monte Carlo simulations, an agent can compute actions that hide its deviations from a supervisor's policy.
Contribution
It combines path integral control with Monte Carlo simulation to generate deceptive policies in continuous-state, discrete-time stochastic systems, avoiding the curse of dimensionality that limits backward dynamic programming.
Findings
Path integral control enables efficient deception policy synthesis.
Monte Carlo simulations allow online computation of deceptive actions.
The method converges asymptotically to the optimal control distribution.
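To illustrate how Monte Carlo rollouts can produce an action online, here is a minimal Python sketch of a generic path-integral-style sampler. It is not the authors' algorithm: the one-dimensional dynamics, the Gaussian supervisor policy, the agent's cost, the temperature `lam`, and every function name are hypothetical choices made for illustration. Rollouts are simulated under the supervisor's policy, each is weighted by the exponential of its negative cost, and one first action is drawn in proportion to those weights.

```python
import numpy as np

SIGMA = 1.0          # assumed std of the supervisor's Gaussian policy
AGENT_TARGET = 2.0   # assumed state the agent secretly prefers


def supervisor_mean(x):
    # Hypothetical supervisor policy mean: steer the state toward 0.
    return -0.5 * x


def dynamics(x, u):
    # Hypothetical deterministic state transition.
    return x + u


def agent_cost(x):
    # Hypothetical stage cost for the agent's own goal.
    return (x - AGENT_TARGET) ** 2


def sample_deceptive_action(x0, horizon, n_rollouts, lam, rng):
    """Draw one first action by weighting supervisor-policy rollouts.

    Returns the chosen action, all sampled first actions, and the
    normalized weights.
    """
    x = np.full(n_rollouts, x0, dtype=float)
    cost = np.zeros(n_rollouts)
    first_actions = None
    for t in range(horizon):
        # Sample actions from the supervisor's (reference) policy.
        u = supervisor_mean(x) + SIGMA * rng.standard_normal(n_rollouts)
        if t == 0:
            first_actions = u.copy()
        x = dynamics(x, u)
        cost += agent_cost(x)
    # Exponential weights; subtracting the minimum avoids underflow.
    w = np.exp(-(cost - cost.min()) / lam)
    w /= w.sum()
    # Resample a first action in proportion to its rollout weight.
    idx = rng.choice(n_rollouts, p=w)
    return first_actions[idx], first_actions, w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    action, first_actions, w = sample_deceptive_action(0.0, 5, 5000, 1.0, rng)
    print("sampled action:", action)
    print("weighted mean of first actions:", np.dot(w, first_actions))
```

Each rollout is independent, so the loop over samples vectorizes (as above) or parallelizes easily. In this sketch a larger `lam` keeps the sampled actions closer to the supervisor's policy, while a smaller `lam` favors the agent's own cost.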
Abstract
We consider a setting where a supervisor delegates an agent to perform a certain control task, while the agent is incentivized to deviate from the given policy to achieve its own goal. In this work, we synthesize the optimal deceptive policies for an agent who attempts to hide its deviations from the supervisor's policy. We study the deception problem in the continuous-state discrete-time stochastic dynamics setting and, using motivations from hypothesis testing theory, formulate a Kullback-Leibler control problem for the synthesis of deceptive policies. This problem can be solved using backward dynamic programming in principle, which suffers from the curse of dimensionality. However, under the assumption of deterministic state dynamics, we show that the optimal deceptive actions can be generated using path integral control. This allows the agent to numerically compute the deceptive…
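The abstract excerpt does not state the objective explicitly. As a hedged illustration, one common form of a Kullback-Leibler control problem is shown below. All symbols are assumptions introduced here: $P$ is the path distribution induced by the supervisor's policy, $Q$ is the one induced by the agent's policy, $C(\tau)$ is the agent's cost along a path $\tau$, and $\lambda > 0$ trades off the agent's goal against detectability.

```latex
\min_{Q} \; \mathbb{E}_{\tau \sim Q}\big[ C(\tau) \big]
  + \lambda \, D_{\mathrm{KL}}\big( Q \,\|\, P \big),
\qquad
Q^{*}(\tau) = \frac{P(\tau)\, \exp\!\big(-C(\tau)/\lambda\big)}
                   {\mathbb{E}_{\tau' \sim P}\big[ \exp\!\big(-C(\tau')/\lambda\big) \big]} .
```

The minimizer $Q^{*}$ reweights paths drawn from $P$ by $\exp(-C(\tau)/\lambda)$. This is why, in problems of this form, sampling from the reference policy and applying exponential weights gives a Monte Carlo approximation of the optimal distribution.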
Taxonomy
Topics: Adversarial Robustness in Machine Learning
