On Policy Stochasticity in Mutual Information Optimal Control of Linear Systems
Shoju Enami, Kenji Kashima

TL;DR
This paper explores how the temperature parameter influences the stochasticity of policies in mutual information optimal control for linear systems, providing theoretical conditions and numerical validation.
Contribution
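The paper extends a previous result on the mutual information optimal control problem for discrete-time linear systems, establishes that an optimal policy exists, and derives conditions on the temperature parameter that determine whether the optimal policy is stochastic or deterministic.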
Findings
An optimal policy exists for the mutual information optimal control problem (MIOCP) of discrete-time linear systems.
Conditions on the temperature parameter determine whether the optimal policy is stochastic or deterministic.
Numerical experiments validate the theoretical conditions.
Abstract
In recent years, mutual information optimal control has been proposed as an extension of maximum entropy optimal control. Both approaches introduce regularization terms to render the policy stochastic, and it is important to theoretically clarify the relationship between the temperature parameter (i.e., the coefficient of the regularization term) and the stochasticity of the policy. Unlike in maximum entropy optimal control, this relationship remains unexplored in mutual information optimal control. In this paper, we investigate this relationship for a mutual information optimal control problem (MIOCP) of discrete-time linear systems. After extending the result of a previous study of the MIOCP, we establish the existence of an optimal policy of the MIOCP, and then derive the respective conditions on the temperature parameter under which the optimal policy becomes stochastic and…
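To make the role of the temperature parameter concrete, the following minimal sketch covers only the maximum entropy baseline mentioned in the abstract, not the MIOCP results of the paper. It assumes a hypothetical one-step scalar problem with cost q(u) = 0.5*r*u^2 + b*u, for which the entropy-regularized optimal policy is the Gibbs density proportional to exp(-q(u)/eps). That density is Gaussian with mean -b/r and variance eps/r, so in this toy setting the policy variance grows linearly with the temperature. The names `gibbs_policy_scalar`, `numerical_variance`, `r`, `b`, and `eps` are illustrative and do not come from the paper.

```python
import math


def gibbs_policy_scalar(r, b, eps):
    """Mean and variance of the density proportional to
    exp(-(0.5*r*u**2 + b*u) / eps), assuming r > 0 and eps > 0.

    Illustrative one-step maximum entropy toy problem, not the paper's MIOCP.
    """
    if r <= 0 or eps <= 0:
        raise ValueError("r and eps must be positive")
    return -b / r, eps / r


def numerical_variance(r, b, eps, n=20001):
    """Check the closed form by a Riemann sum over +/- 10 standard deviations."""
    mean, var = gibbs_policy_scalar(r, b, eps)
    half_width = 10.0 * math.sqrt(var)
    du = 2.0 * half_width / (n - 1)
    total = 0.0
    second_moment = 0.0
    for i in range(n):
        d = -half_width + i * du  # offset from the mean
        # q(mean + d) - q(mean) = 0.5 * r * d**2, so the weight stays in (0, 1].
        w = math.exp(-0.5 * r * d * d / eps)
        total += w
        second_moment += w * d * d
    return second_moment / total


if __name__ == "__main__":
    for eps in (0.01, 0.1, 1.0, 10.0):
        mean, var = gibbs_policy_scalar(r=2.0, b=1.0, eps=eps)
        print(f"eps={eps}: mean={mean}, variance={var}, "
              f"numerical variance={numerical_variance(2.0, 1.0, eps)}")
```

In this baseline the policy is stochastic for every positive temperature and becomes deterministic only in the limit eps -> 0. According to the abstract, the corresponding relationship for the MIOCP is what the paper sets out to establish.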
