TL;DR
This paper introduces policy decomposition, a novel approximation method for optimal control that provides explicit suboptimality estimates, enabling efficient control policy synthesis for complex systems with quantifiable performance bounds.
Contribution
The paper presents policy decomposition, which splits a complex optimal control problem into lower-dimensional subproblems and provides explicit suboptimality estimates for the recombined policy, making synthesis cheaper and the resulting performance loss predictable in advance.
Findings
The LQR and DDP estimates of the value error accurately identify the best-performing combinations of subproblems.
Control policies are computed faster with minimal performance loss.
The method is demonstrated on a cart-pole, a 3-link balancing biped, and N-link planar manipulators.
Abstract
Numerically computing global policies for optimal control problems for complex dynamical systems is mostly intractable. As a consequence, a number of approximation methods have been developed. However, none of the current methods can quantify by how much the resulting control underperforms the elusive globally optimal solution. Here we propose policy decomposition, an approximation method with explicit suboptimality estimates. Our method decomposes the optimal control problem into lower-dimensional subproblems, whose optimal solutions are recombined to build a control policy for the entire system. Many such combinations exist, and we introduce the value error and its LQR and DDP estimates to predict the suboptimality of possible combinations and prioritize the ones that minimize it. Using a cart-pole, a 3-link balancing biped and N-link planar manipulators as example systems, we find that…
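To make the idea of an LQR-based suboptimality estimate concrete, here is a minimal sketch for a linear system, and not the paper's implementation. It assumes a fully decoupled decomposition in which each subproblem owns a disjoint set of states and inputs, and it summarizes the gap between the decomposed and the optimal quadratic cost by the trace of the difference of the two cost matrices. The function name `lqr_value_error`, the `groups` argument, the trace summary, and the toy matrices are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov


def lqr_value_error(A, B, Q, R, groups):
    """Compare a decoupled LQR policy against the full LQR policy.

    groups: list of (state_indices, input_indices) pairs, one per subproblem
    (hypothetical interface). Returns (scalar gap, P_opt, P_dec).
    """
    n, m = B.shape

    # Optimal cost matrix of the full problem: V*(x) = x' P_opt x.
    P_opt = solve_continuous_are(A, B, Q, R)

    # Solve each lower-dimensional LQR subproblem, ignoring coupling terms,
    # and place its gain into a block-structured gain for the full system.
    K_dec = np.zeros((m, n))
    for state_idx, input_idx in groups:
        A_s = A[np.ix_(state_idx, state_idx)]
        B_s = B[np.ix_(state_idx, input_idx)]
        Q_s = Q[np.ix_(state_idx, state_idx)]
        R_s = R[np.ix_(input_idx, input_idx)]
        P_s = solve_continuous_are(A_s, B_s, Q_s, R_s)
        K_dec[np.ix_(input_idx, state_idx)] = np.linalg.solve(R_s, B_s.T @ P_s)

    # Cost of running the recombined policy u = -K_dec x on the full system.
    A_cl = A - B @ K_dec
    if np.max(np.linalg.eigvals(A_cl).real) >= 0:
        raise ValueError("decomposed policy does not stabilize the full system")
    P_dec = solve_continuous_lyapunov(A_cl.T, -(Q + K_dec.T @ R @ K_dec))

    # Assumed scalar summary of the gap V_dec(x) - V*(x) = x' (P_dec - P_opt) x.
    return float(np.trace(P_dec - P_opt)), P_opt, P_dec


# Toy example (made-up numbers): two weakly coupled scalar subsystems.
A = np.array([[0.0, 0.5], [0.3, 0.0]])
B = np.eye(2)
Q = np.eye(2)
R = np.eye(2)
groups = [([0], [0]), ([1], [1])]
gap, P_opt, P_dec = lqr_value_error(A, B, Q, R, groups)
print("LQR value-error estimate (trace gap):", gap)
```

Because the full LQR solution is optimal from every initial state, the gap is never negative, and it vanishes when the subsystems are truly uncoupled. Ranking candidate decompositions by such a gap is one way to prioritize them before solving any nonlinear subproblem.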
