Well-Posed KL-Regularized Control via Wasserstein and Kalman-Wasserstein KL Divergences

Viktor Stein; Adwait Datar; Nihat Ay

arXiv:2602.02250·math.OC·February 3, 2026

TL;DR

This paper introduces Wasserstein and Kalman-Wasserstein KL divergences as alternatives to the classical KL regularizer in reinforcement learning. The new divergences stay finite under support mismatch, keep the regularized control problem well-posed in the low-noise limit, and show improvements in control tasks.

Contribution

The paper develops a unified geometric framework for KL analogues built on transport-based geometries. The resulting divergences are finite under support mismatch and give better-behaved regularizers for control.

Findings

01 Divergences remain finite under support mismatch.

02 Regularized control problems become well-posed.

03 Control performance improves in experiments.
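
The support-mismatch finding can be illustrated on a toy case. The sketch below is not the paper's Wasserstein KL divergence; it only contrasts the classical KL with the ordinary Wasserstein-1 distance, used here as a stand-in for a transport-based quantity. The function names `kl_discrete` and `w1_on_grid` and the three-point grid are assumptions made for illustration.

```python
import math


def kl_discrete(p, q):
    """Classical KL(p || q) for two probability vectors on the same grid.

    Returns math.inf as soon as p puts mass where q has none
    (support mismatch).
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def w1_on_grid(p, q, spacing=1.0):
    """Wasserstein-1 distance between p and q on an evenly spaced 1D grid.

    In one dimension, W1 is the integral of |CDF_p - CDF_q|, which on a
    grid is a sum over the gaps between neighbouring grid points.
    """
    cum_p = cum_q = 0.0
    dist = 0.0
    for pi, qi in zip(p[:-1], q[:-1]):
        cum_p += pi
        cum_q += qi
        dist += abs(cum_p - cum_q) * spacing
    return dist


if __name__ == "__main__":
    # Hypothetical example: all mass on the first vs. the last grid point.
    p = [1.0, 0.0, 0.0]
    q = [0.0, 0.0, 1.0]
    print("KL:", kl_discrete(p, q))
    print("W1:", w1_on_grid(p, q))
```

With disjoint supports the KL is infinite regardless of how close the two point masses are, while the transport distance simply reports how far the mass has to move.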

Abstract

Kullback-Leibler divergence (KL) regularization is widely used in reinforcement learning, but it becomes infinite under support mismatch and can degenerate in low-noise limits. Utilizing a unified information-geometric framework, we introduce (Kalman)-Wasserstein-based KL analogues by replacing the Fisher-Rao geometry in the dynamical formulation of the KL with transport-based geometries, and we derive closed-form values for common distribution families. These divergences remain finite under support mismatch and yield a geometric interpretation of regularization heuristics used in Kalman ensemble methods. We demonstrate the utility of these divergences in KL-regularized optimal control. In the fully tractable setting of linear time-invariant systems with Gaussian process noise, the classical KL reduces to a quadratic control penalty that becomes singular as process noise vanishes. Our…
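
The singular low-noise behaviour mentioned in the abstract can be seen in a scalar toy case. The sketch below assumes a hypothetical system x' = a x + b u + w with w ~ N(0, sigma^2), so the controlled and uncontrolled next-state laws are Gaussians with the same variance and means that differ by b u. It uses the standard closed forms for the KL and the squared Wasserstein-2 distance between scalar Gaussians. The squared W2 distance is only a transport-based point of comparison here, not the paper's Wasserstein KL divergence, and all function names are illustrative.

```python
import math


def kl_gauss_1d(m1, s1, m2, s2):
    """KL( N(m1, s1^2) || N(m2, s2^2) ) for scalar Gaussians (s1, s2 > 0)."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2.0 * s2**2) - 0.5


def w2_sq_gauss_1d(m1, s1, m2, s2):
    """Squared Wasserstein-2 distance between two scalar Gaussians."""
    return (m1 - m2) ** 2 + (s1 - s2) ** 2


def kl_control_penalty(u, b, sigma):
    """KL between controlled and uncontrolled one-step transitions.

    Assumed toy model: x' = a*x + b*u + w, w ~ N(0, sigma^2).
    The drift a*x cancels, leaving (b*u)^2 / (2*sigma^2).
    """
    return kl_gauss_1d(b * u, sigma, 0.0, sigma)


if __name__ == "__main__":
    u, b = 1.0, 1.0  # hypothetical control input and input gain
    for sigma in (1.0, 0.1, 0.01):
        kl = kl_control_penalty(u, b, sigma)
        w2 = w2_sq_gauss_1d(b * u, sigma, 0.0, sigma)
        print(f"sigma={sigma:<5} KL penalty={kl:10.2f}  W2^2={w2:.2f}")
```

The KL penalty is quadratic in u with weight 1 / (2 sigma^2), so it grows without bound as sigma shrinks, whereas the squared W2 distance between the same two transition laws stays at (b u)^2.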

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques