Unifying Entropy Regularization in Optimal Control: From and Back to Classical Objectives via Iterated Soft Policies and Path Integral Solutions
Ajinkya Bhole, Mohammad Mahmoudi Filabadi, Guillaume Crevecoeur, Tom Lefebvre

TL;DR
This paper presents a unified KL-regularized framework for various optimal control problems, connecting classical and soft-policy formulations through iterated solutions and path integral methods.
Contribution
It introduces a generalized control formulation with independent KL penalties, unifies several control problems, and extends computational advantages to a broader class.
Findings
Unified KL-regularized control framework encompassing SOC and RSOC.
Iterated solutions recover original control objectives.
Path integral solutions and linear Bellman operators for synchronized KL weights.
Abstract
This paper develops a unified perspective on several optimal control formulations through the lens of Kullback-Leibler (KL) regularization. We propose a central problem that separates the KL penalties on policies and transitions with independent weights, thus generalizing the standard trajectory-level KL-regularization used in probabilistic optimal control. This umbrella formulation recovers various control problems: the classical Stochastic Optimal Control (SOC), Risk-Sensitive Stochastic Optimal Control (RSOC), and their policy-based KL-regularized counterparts, termed soft-policy SOC and RSOC, which yield tractable surrogates. Beyond being regularized variants, these soft-policy formulations majorize the original SOC and RSOC, thus, iterating their solutions recovers the original objectives. We further identify a synchronized case of soft-policy RSOC where the policy and transition…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
