Gaining efficiency in deep policy gradient method for continuous-time optimal control problems

Arash Fahim; Md. Arafatur Rahman

arXiv:2502.14141·math.OC·February 25, 2025

Open Access

TL;DR

This paper introduces an efficient multi-scale deep policy gradient method for continuous-time optimal control, optimizing resource allocation and neural network complexity, demonstrated on linear-quadratic problems.

Contribution

It presents a novel multi-scale approach that manages computational resources and neural network complexity for continuous-time control problems.

Findings

01. Effective resource allocation improves training efficiency.

02. Method achieves accurate policies on linear-quadratic control.

03. Theoretical results guide optimal resource distribution.

Abstract

In this paper, we propose an efficient implementation of the deep policy gradient method (PGM) for optimal control problems in continuous time. The proposed method can manage the allocation of computational resources, the number of trajectories, and the complexity of the neural network architecture. This is particularly important for continuous-time problems that require a fine time discretization. Each step of the method focuses on a different time scale and learns a policy, modeled by a neural network, for a discretized optimal control problem. The first step has the coarsest time discretization; as we proceed to later steps, the time discretization becomes finer. The optimal trained policy in each step is also used to provide data for the next step. We accompany the multi-scale deep PGM with a theoretical result on allocation of computational resources to obtain a targeted…
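
The coarse-to-fine idea in the abstract can be illustrated with a small sketch on a scalar linear-quadratic problem. This is not the authors' implementation, and several simplifications are assumptions made here for brevity: the policy is a vector of time-dependent linear feedback gains rather than a neural network; the coarse policy is passed to the next level by copying each gain onto the finer grid (a warm start) rather than by generating training data; and the problem constants, the learning rate, and the `schedule` of (time steps, iterations, trajectories) per level are arbitrary illustrative values, not the allocation derived in the paper. The names `cost_and_grad` and `train_multiscale` are hypothetical.

```python
import numpy as np

# Scalar LQ problem (all constants are illustrative assumptions):
#   dX = (A*X + B*u) dt + SIGMA dW
#   cost = E[ int_0^T (Q*X^2 + R*u^2) dt + G*X_T^2 ]
A, B, SIGMA, Q, R, G, T, X0 = 0.0, 1.0, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0


def cost_and_grad(gains, n_traj, rng):
    """Monte Carlo cost of the Euler-discretized problem under the policy
    u_i = -gains[i] * x_i, and its pathwise gradient w.r.t. the gains."""
    n = len(gains)
    h = T / n
    x = np.empty((n + 1, n_traj))
    x[0] = X0
    noise = rng.standard_normal((n, n_traj))
    # Forward pass: simulate n_traj trajectories on the current time grid.
    for i in range(n):
        drift = (A - B * gains[i]) * h
        x[i + 1] = x[i] * (1.0 + drift) + SIGMA * np.sqrt(h) * noise[i]
    running = sum((Q + R * gains[i] ** 2) * x[i] ** 2 * h for i in range(n))
    cost = np.mean(running + G * x[n] ** 2)
    # Backward pass: p holds d(cost)/d(x_{i+1}) along each trajectory.
    grad = np.empty(n)
    p = 2.0 * G * x[n]
    for i in reversed(range(n)):
        grad[i] = np.mean(2.0 * R * gains[i] * x[i] ** 2 * h - p * B * h * x[i])
        p = (2.0 * (Q + R * gains[i] ** 2) * x[i] * h
             + p * (1.0 + (A - B * gains[i]) * h))
    return cost, grad


def train_multiscale(schedule, lr=0.3, seed=0):
    """Train level by level; each level starts from the previous level's gains.
    Each level's step count must be a multiple of the previous one."""
    rng = np.random.default_rng(seed)
    gains = None
    for n_steps, n_iters, n_traj in schedule:
        if gains is None:
            gains = np.zeros(n_steps)  # coarsest level starts from u = 0
        else:
            # Warm start: copy each coarse gain onto the finer grid.
            gains = np.repeat(gains, n_steps // len(gains))
        h = T / n_steps
        for _ in range(n_iters):
            _, grad = cost_and_grad(gains, n_traj, rng)
            gains = gains - lr * grad / h  # gradient entries scale with h
    return gains


# Hypothetical schedule: (time steps, gradient iterations, trajectories).
schedule = [(2, 200, 2000), (4, 100, 2000), (8, 100, 2000)]
gains = train_multiscale(schedule)
print(gains)
```

For these particular constants the continuous-time Riccati solution is constant, so the optimal feedback gain is 1 at all times; the learned gains on the finest grid should land near that value, up to discretization and sampling error. The point of the sketch is only the structure: most iterations are spent on the cheap coarse grid, and the finer levels refine an already reasonable policy.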

Taxonomy

Topics: Adaptive Dynamic Programming Control · Machine Learning and ELM