Dynamical System Optimization

Emo Todorov

arXiv:2506.08340 · cs.LG · June 11, 2025

TL;DR

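This paper introduces an optimization framework for policy parameters in dynamical systems. Once a parametric policy is specified, the closed-loop system is autonomous, so the algorithms are derived at the autonomous-system level, without direct use of approximate Dynamic Programming or Reinforcement Learning machinery. The same algorithms apply across several domains, including tuning of generative AI models.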

Contribution

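It derives simpler algorithms that compute the same quantities as policy gradients and Hessians, natural gradients, and proximal methods. Because policy parameters and other system parameters are treated uniformly, the same algorithms extend to behavioral cloning, mechanism design, system identification, and learning of state estimators.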

Findings

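01 The autonomous-system algorithms compute the same quantities as policy gradients and Hessians, natural gradients, and proximal methods

02 The framework applies to behavioral cloning, mechanism design, system identification, and learning of state estimators

03 The framework enables tuning of generative AI models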

Abstract

We develop an optimization framework centered around a core idea: once a (parametric) policy is specified, control authority is transferred to the policy, resulting in an autonomous dynamical system. Thus we should be able to optimize policy parameters without further reference to controls or actions, and without directly using the machinery of approximate Dynamic Programming and Reinforcement Learning. Here we derive simpler algorithms at the autonomous system level, and show that they compute the same quantities as policy gradients and Hessians, natural gradients, proximal methods. Analogs to approximate policy iteration and off-policy learning are also available. Since policy parameters and other system parameters are treated uniformly, the same algorithms apply to behavioral cloning, mechanism design, system identification, learning of state estimators. Tuning of generative AI…
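
The core idea in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a scalar linear system with a linear feedback policy u = -theta * x, substitutes the policy into the dynamics to obtain an autonomous system x' = F(x, theta), and then differentiates the rollout cost with respect to theta using forward sensitivities, with no further reference to the control. The names `closed_loop_step` and `rollout_cost_and_grad`, the constants `a`, `b`, `r`, and the quadratic cost are all hypothetical choices made for illustration.

```python
# Illustrative sketch only; names, constants, and the cost are assumptions,
# not the paper's algorithm.

def closed_loop_step(x, theta, a=0.9, b=0.5):
    """Autonomous dynamics x' = F(x, theta), obtained by substituting u = -theta * x
    into the assumed open-loop system x' = a * x + b * u."""
    return (a - b * theta) * x


def rollout_cost_and_grad(theta, x0=1.0, horizon=20, a=0.9, b=0.5, r=0.1):
    """Return (total cost, d cost / d theta) over a finite rollout.

    The gradient uses the forward sensitivity s = dx/dtheta of the autonomous system.
    """
    x, s = x0, 0.0
    cost, grad = 0.0, 0.0
    for _ in range(horizon):
        # Step cost x^2 + r * u^2 with u = -theta * x already eliminated.
        cost += (1.0 + r * theta**2) * x**2
        # Total derivative of the step cost: l_theta + l_x * s.
        grad += 2.0 * r * theta * x**2 + 2.0 * (1.0 + r * theta**2) * x * s
        # Sensitivity recursion s' = F_x * s + F_theta, evaluated at the current x.
        s = (a - b * theta) * s - b * x
        x = closed_loop_step(x, theta, a, b)
    return cost, grad


# Plain gradient descent on the policy parameter (step size chosen by hand).
theta = 0.0
for _ in range(200):
    cost, grad = rollout_cost_and_grad(theta)
    theta -= 0.05 * grad
```

The point of the design is that the loop only ever touches the state `x`, the parameter `theta`, and the sensitivity `s`. Any other system parameter (for example `a` in a system identification setting) could in principle be differentiated the same way.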

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Neural Networks and Reservoir Computing