Duality-Based Stochastic Policy Optimization for Estimation with Unknown Noise Covariances
Shahriar Talebi, Amirhossein Taghvaei, Mehran Mesbahi

TL;DR
This paper introduces a duality-based approach to learn the optimal Kalman gain in state estimation when noise covariances are unknown, using policy learning and stochastic gradient methods with proven convergence.
Contribution
It formalizes the duality between control and estimation to develop a policy learning framework for Kalman gain estimation with convergence guarantees.
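As a minimal sketch of the control-estimation duality referred to here, the steady-state Kalman gain for a pair (A, H) can be obtained by solving the LQR problem for the transposed pair (A^T, H^T) and transposing the resulting gain. The system matrices below are hypothetical, and the noise covariances Q and R are assumed known purely for illustration; the paper's setting is the harder case where they are unknown.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical system: x_{k+1} = A x_k + w_k,  y_k = H x_k + v_k
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)     # process noise covariance (assumed known in this sketch)
R = np.array([[0.5]])   # measurement noise covariance (assumed known in this sketch)


def lqr_gain(A_c, B_c, Q_c, R_c):
    """Infinite-horizon discrete-time LQR gain K for u = -K x, plus the Riccati solution."""
    P = solve_discrete_are(A_c, B_c, Q_c, R_c)
    K = np.linalg.solve(R_c + B_c.T @ P @ B_c, B_c.T @ P @ A_c)
    return K, P


# Duality: solve the LQR problem for (A^T, H^T) and transpose the gain.
K_dual, P = lqr_gain(A.T, H.T, Q, R)
L_kalman = K_dual.T     # steady-state (predictor-form) Kalman gain

# Direct filter formula, for comparison: L = A P H^T (H P H^T + R)^{-1}
L_direct = A @ P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
```

Here `P` plays the role of the steady-state prediction error covariance, and `A - L_kalman @ H` is the closed-loop error dynamics of the filter.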
Findings
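Gradient descent on the estimation gain converges globally, extending existing guarantees for direct policy updates in LQR.
A stochastic gradient descent method learns the optimal Kalman gain without knowledge of the noise covariances.
Numerical examples demonstrate the approach's effectiveness.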
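To make the first finding concrete, the following sketch runs plain gradient descent directly on the filter gain L for a hypothetical system, using the steady-state error covariance trace as the cost. It is an illustration under assumptions, not the paper's algorithm: Q and R are assumed known here so the exact gradient can be computed from two Lyapunov equations, whereas the paper's stochastic method targets unknown covariances. The cost, the backtracking rule, and all names are choices made for this example.

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

# Hypothetical system and (here assumed known) noise covariances
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)
R = np.array([[0.5]])
n = A.shape[0]


def cost(L):
    """J(L) = tr(X_L), with X_L = A_L X_L A_L^T + Q + L R L^T and A_L = A - L H."""
    A_L = A - L @ H
    if max(abs(np.linalg.eigvals(A_L))) >= 1.0:
        return np.inf  # gain is not stabilizing
    return np.trace(solve_discrete_lyapunov(A_L, Q + L @ R @ L.T))


def gradient(L):
    """Gradient of J: 2 Y_L [L (R + H X_L H^T) - A X_L H^T]."""
    A_L = A - L @ H
    X = solve_discrete_lyapunov(A_L, Q + L @ R @ L.T)
    Y = solve_discrete_lyapunov(A_L.T, np.eye(n))  # Y = A_L^T Y A_L + I
    return 2.0 * Y @ (L @ (R + H @ X @ H.T) - A @ X @ H.T)


def gradient_descent(L0, iters=500, c=0.1):
    """Gradient descent with Armijo backtracking, which also keeps iterates stabilizing."""
    L = L0.copy()
    costs = [cost(L)]
    for _ in range(iters):
        g = gradient(L)
        g2 = np.sum(g * g)
        if g2 < 1e-18:
            break
        eta = 1.0
        while cost(L - eta * g) > costs[-1] - c * eta * g2:
            eta *= 0.5
        L = L - eta * g
        costs.append(cost(L))
    return L, costs


# A is stable, so the zero gain is a valid stabilizing starting point.
L_gd, costs = gradient_descent(np.zeros((n, 1)))

# Reference: steady-state Kalman gain from the filter Riccati equation.
P = solve_discrete_are(A.T, H.T, Q, R)
L_star = A @ P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
```

Comparing `L_gd` with `L_star` gives a quick check that the iterates approach the Riccati-based gain on this example; the backtracking step is a convenience for the sketch and is not claimed to match the step-size conditions analyzed in the paper.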
Abstract
Duality of control and estimation allows mapping recent advances in data-guided control to the estimation setup. This paper formalizes and utilizes such a mapping to consider learning the optimal (steady-state) Kalman gain when process and measurement noise statistics are unknown. Specifically, building on the duality between synthesizing optimal control and estimation gains, the filter design problem is formalized as direct policy learning. In this direction, the duality is used to extend existing theoretical guarantees of direct policy updates for Linear Quadratic Regulator (LQR) to establish global convergence of the Gradient Descent (GD) algorithm for the estimation problem--while addressing subtle differences between the two synthesis problems. Subsequently, a Stochastic Gradient Descent (SGD) approach is adopted to learn the optimal Kalman gain without the knowledge of noise…
Taxonomy
Topics: Advanced Control Systems Optimization · Control Systems and Identification · Reinforcement Learning in Robotics
