Duality-Based Stochastic Policy Optimization for Estimation with Unknown Noise Covariances

Shahriar Talebi; Amirhossein Taghvaei; Mehran Mesbahi

arXiv:2210.14878·eess.SY·March 8, 2023

Open Access

TL;DR

This paper introduces a duality-based approach to learn the optimal Kalman gain in state estimation when noise covariances are unknown, using policy learning and stochastic gradient methods with proven convergence.

Contribution

It formalizes the duality between control and estimation to develop a policy learning framework for Kalman gain estimation with convergence guarantees.
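The duality referred to here can be written out in its standard textbook form. The block below is a sketch under assumed notation, not the paper's own statement: the symbols A, H, L, the noise terms \xi_t and \omega_t, and the primed LQR quantities are all introduced for illustration, with the system taken as x_{t+1} = A x_t + \xi_t, y_t = H x_t + \omega_t and the error defined as e_t = x_t - \hat{x}_t.

```latex
\begin{aligned}
\text{estimator:}\quad & \hat{x}_{t+1} = A\hat{x}_t + L\,(y_t - H\hat{x}_t), \\
\text{error dynamics:}\quad & e_{t+1} = (A - LH)\,e_t + \xi_t - L\,\omega_t, \\
\text{dual closed loop:}\quad & (A - LH)^{\top} = A^{\top} - H^{\top}L^{\top}
  \;\longleftrightarrow\; A' - B'K, \qquad (A', B', K) = (A^{\top}, H^{\top}, L^{\top}).
\end{aligned}
```

Read this way, choosing an estimation gain L for the pair (A, H) has the same algebraic structure as choosing a state-feedback gain K for the pair (A^T, H^T), which is what lets policy-optimization results for LQR be carried over.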

Findings

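01 Global convergence of Gradient Descent (GD) for the estimation gain, obtained by extending LQR policy-update guarantees through duality.

02 A Stochastic Gradient Descent (SGD) method that learns the optimal Kalman gain without knowledge of the noise covariances.

03 Numerical examples demonstrating the approach's effectiveness.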

Abstract

Duality of control and estimation allows mapping recent advances in data-guided control to the estimation setup. This paper formalizes and utilizes such a mapping to consider learning the optimal (steady-state) Kalman gain when process and measurement noise statistics are unknown. Specifically, building on the duality between synthesizing optimal control and estimation gains, the filter design problem is formalized as direct policy learning. In this direction, the duality is used to extend existing theoretical guarantees of direct policy updates for Linear Quadratic Regulator (LQR) to establish global convergence of the Gradient Descent (GD) algorithm for the estimation problem, while addressing subtle differences between the two synthesis problems. Subsequently, a Stochastic Gradient Descent (SGD) approach is adopted to learn the optimal Kalman gain without the knowledge of noise…
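To make the GD step concrete, here is a minimal NumPy sketch of direct gradient descent over the estimation gain. It is an illustration under stated assumptions, not the paper's algorithm: it assumes the noise covariances Q and R are known (the model-based case, not the SGD setting), it uses the trace of the steady-state error covariance as the cost, which may differ from the paper's objective, and the system matrices, step size, and function names are all hypothetical. The result is compared against the gain obtained from a plain Riccati iteration.

```python
import numpy as np

def solve_lyapunov(F, W):
    """Solve X = F X F^T + W (needs spectral radius of F below 1)."""
    n = F.shape[0]
    # With row-major flattening, vec(F X F^T) = (F kron F) vec(X).
    vec_x = np.linalg.solve(np.eye(n * n) - np.kron(F, F), W.reshape(-1))
    return vec_x.reshape(n, n)

def cost_and_gradient(L, A, H, Q, R):
    """Assumed cost J(L) = trace(P_L) and its gradient with respect to L."""
    A_L = A - L @ H
    # Steady-state error covariance: P = A_L P A_L^T + Q + L R L^T.
    P = solve_lyapunov(A_L, Q + L @ R @ L.T)
    # Adjoint variable: Y = A_L^T Y A_L + I.
    Y = solve_lyapunov(A_L.T, np.eye(A.shape[0]))
    grad = 2.0 * Y @ (L @ (R + H @ P @ H.T) - A @ P @ H.T)
    return np.trace(P), grad

def riccati_gain(A, H, Q, R, iters=2000):
    """Reference steady-state (predictor-form) Kalman gain via Riccati iteration."""
    P = Q.copy()
    for _ in range(iters):
        G = A @ P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        P = A @ P @ A.T - G @ H @ P @ A.T + Q
    return A @ P @ H.T @ np.linalg.inv(H @ P @ H.T + R)

# Hypothetical two-state system with a scalar measurement.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
H = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)
R = np.array([[0.5]])

L = np.zeros((2, 1))   # A is stable, so L = 0 is a stabilizing initial gain
step = 0.01            # hand-picked for this example, not a tuned rule
for _ in range(5000):
    _, grad = cost_and_gradient(L, A, H, Q, R)
    L = L - step * grad

L_star = riccati_gain(A, H, Q, R)
```

The gradient vanishes exactly when L (R + H P H^T) = A P H^T, which is the fixed-point condition satisfied by the steady-state Kalman gain, so the GD iterate and the Riccati gain should agree for this example. Handling unknown Q and R would require replacing the exact gradient with an estimate built from measured outputs, which this sketch does not attempt.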

Taxonomy

Topics: Advanced Control Systems Optimization · Control Systems and Identification · Reinforcement Learning in Robotics