Sub-optimality bounds for certainty equivalent policies in partially observed systems
Berk Bozkurt, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

TL;DR
This paper extends the certainty equivalence principle to general partially observed stochastic systems and bounds how far certainty equivalent policies can fall short of optimal policies.
Contribution
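It introduces a generalized framework for certainty equivalent policies in non-linear systems, allowing any state estimate rather than only the MMSE estimate, and derives upper bounds on their sub-optimality when the cost and dynamics are smooth in an appropriate sense. This generalizes the classical linear case with quadratic costs.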
Findings
Derived upper bounds on sub-optimality of certainty equivalent policies
Illustrated results with multiple examples
Extended the principle beyond linear systems with quadratic costs
Abstract
In this paper, we present a generalization of the certainty equivalence principle of stochastic control. One interpretation of the classical certainty equivalence principle for linear systems with output feedback and quadratic costs is as follows: the optimal action at each time is obtained by evaluating the optimal state-feedback policy of the stochastic linear system at the minimum mean square error (MMSE) estimate of the state. Motivated by this interpretation, we consider certainty equivalent policies for general (non-linear) partially observed stochastic systems that allow for any state estimate rather than restricting to MMSE estimates. In such settings, the certainty equivalent policy is not optimal. For models where the cost and the dynamics are smooth in an appropriate sense, we derive upper bounds on the sub-optimality of certainty equivalent policies. We present several…
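To make the classical interpretation concrete, the following is a minimal Python sketch of a certainty equivalent controller for a scalar linear system with quadratic cost. It is not taken from the paper: the function names, the scalar model, and all numerical parameters are assumptions chosen for illustration. The state-feedback gains come from a backward Riccati recursion, the MMSE estimate comes from a Kalman filter, and the action at each step is the state-feedback law evaluated at the estimate.

```python
import random


def lqr_gains(a, b, q, r, horizon):
    """Backward Riccati recursion for a scalar finite-horizon LQR problem.

    Assumed model: x[t+1] = a*x[t] + b*u[t] + noise, per-step cost
    q*x^2 + r*u^2, terminal cost q*x^2. Returns gains k[t] such that the
    state-feedback law is u[t] = -k[t] * x[t].
    """
    p = q  # terminal cost-to-go weight (assumed equal to q)
    gains = [0.0] * horizon
    for t in reversed(range(horizon)):
        k = (a * b * p) / (r + b * b * p)
        gains[t] = k
        p = q + a * a * p - a * b * p * k
    return gains


def kalman_update(xhat_pred, var_pred, y, c, obs_var):
    """Kalman measurement update for y = c*x + noise; returns (mean, variance)."""
    gain = var_pred * c / (c * c * var_pred + obs_var)
    xhat = xhat_pred + gain * (y - c * xhat_pred)
    var = (1.0 - gain * c) * var_pred
    return xhat, var


def simulate_ce(a=0.9, b=1.0, c=1.0, q=1.0, r=0.1,
                proc_var=0.05, obs_var=0.1, horizon=20, seed=0):
    """Run one trajectory under the certainty equivalent policy; return its cost.

    All default parameters are hypothetical illustration values.
    """
    rng = random.Random(seed)
    gains = lqr_gains(a, b, q, r, horizon)
    x = rng.gauss(0.0, 1.0)           # true (hidden) initial state
    xhat_pred, var_pred = 0.0, 1.0    # prior mean and variance of the state
    total_cost = 0.0
    for t in range(horizon):
        y = c * x + rng.gauss(0.0, obs_var ** 0.5)        # noisy observation
        xhat, var = kalman_update(xhat_pred, var_pred, y, c, obs_var)
        u = -gains[t] * xhat          # state-feedback law applied to the estimate
        total_cost += q * x * x + r * u * u
        x = a * x + b * u + rng.gauss(0.0, proc_var ** 0.5)
        xhat_pred = a * xhat + b * u  # time update of the estimate
        var_pred = a * a * var + proc_var
    return total_cost + q * x * x     # add terminal cost


if __name__ == "__main__":
    print(simulate_ce())
```

In this linear-quadratic setting with Gaussian noise, the policy above coincides with the optimal output-feedback policy. Replacing `kalman_update` with some other estimator, or the linear model with a non-linear one, gives the kind of generalized certainty equivalent policy the abstract describes, which is in general no longer optimal.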
Taxonomy
Topics: Stability and Control of Uncertain Systems · Adaptive Dynamic Programming Control · Advanced Control Systems Optimization
