An Efficient On-Policy Deep Learning Framework for Stochastic Optimal Control
Mengjian Hua, Mathieu Laurière, Eric Vanden-Eijnden

TL;DR
This paper introduces a new on-policy deep learning algorithm for stochastic optimal control. It uses the Girsanov theorem to compute policy gradients directly, which improves computational speed and scalability and enables efficient optimization of high-dimensional control policies.
Contribution
The method leverages the Girsanov theorem to compute on-policy gradients directly, avoiding expensive backpropagation through stochastic differential equations and the solution of adjoint problems, which improves efficiency and scalability.
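As a rough illustration of how a Girsanov-based gradient avoids differentiating through the simulated path, the identity below is a sketch in assumed notation (drift $b$, scalar noise level $\sigma$, running cost $f$, terminal cost $g$, policy $u_\theta$); none of these symbols are taken from the source, and the paper's exact formulation may differ. The trajectory $X$ is simulated under the current policy and treated as fixed data; only $u_\theta$ is differentiated.

```latex
% Assumed controlled dynamics and objective (notation is illustrative):
%   dX_t = \big(b(X_t,t) + u_\theta(X_t,t)\big)\,dt + \sigma\,dW_t,
%   J(\theta) = \mathbb{E}\Big[\int_0^T \big(\tfrac12 |u_\theta(X_t,t)|^2 + f(X_t,t)\big)\,dt + g(X_T)\Big].
% Differentiating the Girsanov likelihood ratio at the current policy gives
\nabla_\theta J(\theta)
  = \mathbb{E}\Big[
      \int_0^T \nabla_\theta u_\theta(X_t,t)^{\top} u_\theta(X_t,t)\,dt
      + \Big(\int_0^T \big(\tfrac12 |u_\theta(X_t,t)|^2 + f(X_t,t)\big)\,dt + g(X_T)\Big)
        \,\sigma^{-1}\!\int_0^T \nabla_\theta u_\theta(X_t,t)^{\top}\,dW_t
    \Big].
```

The first term is the direct dependence of the running cost on the policy; the second is a score-function term, in which the total path cost multiplies a stochastic integral against the same Brownian increments used to simulate the path.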
Findings
Significant speedup over existing methods.
Improved memory efficiency in high-dimensional problems.
Successful application to sampling and diffusion models.
Abstract
We present a novel on-policy algorithm for solving stochastic optimal control (SOC) problems. By leveraging the Girsanov theorem, our method directly computes on-policy gradients of the SOC objective without expensive backpropagation through stochastic differential equations or adjoint problem solutions. This approach significantly accelerates the optimization of neural network control policies while scaling efficiently to high-dimensional problems and long time horizons. We evaluate our method on classical SOC benchmarks as well as applications to sampling from unnormalized distributions via Schrödinger-Föllmer processes and fine-tuning pre-trained diffusion models. Experimental results demonstrate substantial improvements in both computational speed and memory efficiency compared to existing approaches.
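The idea of an on-policy gradient that needs no backpropagation through the SDE can be sketched on a toy problem. The example below is not the authors' implementation: it assumes a one-dimensional process dX = u dt + sigma dW, a hypothetical two-parameter policy u(x) = theta[0] + theta[1] * x, a quadratic running cost u^2 / 2, and a terminal cost X_T^2 / 2. The names `policy_features` and `girsanov_gradient` are made up for illustration, and only the standard library is used so the sketch stays dependency-free.

```python
import math
import random


def policy_features(x):
    # Hypothetical linear-in-parameters policy: u(x) = theta[0] * 1 + theta[1] * x,
    # so the gradient of u with respect to theta is just this feature vector.
    return (1.0, x)


def girsanov_gradient(theta, x0=1.0, horizon=1.0, sigma=1.0,
                      n_steps=20, n_paths=40_000, seed=0):
    """Monte Carlo estimate of (dJ/dtheta, J) using a score-function estimator.

    Paths are simulated with Euler-Maruyama under the current policy; no
    derivative is ever taken through the state x (assumed toy setup).
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    n_params = len(theta)
    grad = [0.0] * n_params
    mean_cost = 0.0
    for _ in range(n_paths):
        x = x0
        running = 0.0                 # accumulates 0.5 * u^2 * dt
        direct = [0.0] * n_params     # accumulates grad_u * u * dt
        score = [0.0] * n_params      # accumulates grad_u * dW / sigma
        for _ in range(n_steps):
            phi = policy_features(x)
            u = sum(t * p for t, p in zip(theta, phi))
            dw = rng.gauss(0.0, sqrt_dt)
            running += 0.5 * u * u * dt
            for i in range(n_params):
                direct[i] += phi[i] * u * dt
                score[i] += phi[i] * dw / sigma
            x += u * dt + sigma * dw
        cost = running + 0.5 * x * x  # total path cost with terminal term
        mean_cost += cost / n_paths
        for i in range(n_params):
            grad[i] += (direct[i] + cost * score[i]) / n_paths
    return grad, mean_cost


if __name__ == "__main__":
    grad, cost = girsanov_gradient([0.0, 0.0])
    print("estimated cost:", cost)
    print("estimated gradient:", grad)
```

For this toy setup with theta = (0, 0), x0 = 1, sigma = 1 and horizon 1, the continuous-time values work out by hand to a cost of 1 and a gradient of (1, 1.5), so the estimates should land near those numbers up to Monte Carlo and time-discretization error. Memory use per path is constant in the number of time steps, since only running sums are stored rather than a computation graph over the trajectory.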
Taxonomy
Topics: Model Reduction and Neural Networks · Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion
