Accelerating trajectory optimization with Sobolev-trained diffusion policies
Théotime Le Hellard, Franki Nguimatsia Tiofack, Quentin Le Lidec, Justin Carpentier

TL;DR
This paper introduces Sobolev-trained diffusion policies for warm-starting trajectory optimization. By exploiting the solver's feedback gains as first-order information during training, the approach reduces solving time by up to 20 times and lowers inference latency.
Contribution
It develops a novel Sobolev learning method for diffusion policies that incorporates feedback gains, improving warm-start efficiency for trajectory optimization.
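As a rough illustration of the idea, Sobolev training in general fits both a function's values and its derivatives. The sketch below applies that to a toy scalar policy u = w*x + b, assuming the solver's feedback gain k is used as the target for the policy's derivative with respect to the state. The names `sobolev_loss`, `lam`, and the sample layout are hypothetical, and the paper's actual method applies to diffusion policies rather than this linear toy model.

```python
def sobolev_loss(w, b, samples, lam=1.0):
    """Value-matching plus derivative-matching loss for the toy policy u = w*x + b.

    samples: list of (x, u, k) tuples, assumed to come from a trajectory
    optimizer: state x, optimal control u, and feedback gain k ~ du/dx.
    lam: hypothetical weight on the derivative term.
    """
    value_term = 0.0
    gain_term = 0.0
    for x, u, k in samples:
        value_term += (w * x + b - u) ** 2   # match the control value
        gain_term += (w - k) ** 2            # d(w*x + b)/dx = w, match the gain
    n = len(samples)
    return value_term / n + lam * gain_term / n


# A single demonstration point: at x = 1 the solver returns u = -2 with gain k = -2.
samples = [(1.0, -2.0, -2.0)]

# Without the derivative term, (w=0, b=-2) fits the point as well as (w=-2, b=0).
value_only_bad_fit = sobolev_loss(0.0, -2.0, samples, lam=0.0)
# With the derivative term, only the policy with the right slope reaches zero loss.
sobolev_bad_fit = sobolev_loss(0.0, -2.0, samples, lam=1.0)
sobolev_good_fit = sobolev_loss(-2.0, 0.0, samples, lam=1.0)
```

In this toy setting the derivative term removes an ambiguity that a value-only loss leaves open, which gives some intuition for why first-order information could help when few trajectories are available.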
Findings
Reduces trajectory optimization solving time by a factor of up to 20.
Learns effective initial guesses from very few trajectories.
Decreases inference latency by enabling fewer diffusion steps.
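To give some intuition for why a good initial guess shortens solving time, here is a minimal sketch in which plain gradient descent on a one-dimensional quadratic stands in for an iterative solver. The cost function, step size, tolerance, and both starting points are assumptions chosen purely for illustration. They are not the paper's solver or benchmarks, and the iteration counts say nothing about the reported speed-ups.

```python
def iterations_to_converge(x0, lr=0.25, tol=1e-6, max_iter=10_000):
    """Count gradient-descent steps needed to minimize f(x) = (x - 3)^2 from x0."""
    x = x0
    for i in range(max_iter):
        grad = 2.0 * (x - 3.0)
        if abs(grad) < tol:      # converged: gradient is small enough
            return i
        x -= lr * grad           # plain gradient step
    return max_iter


cold_start_iters = iterations_to_converge(0.0)   # uninformed initial guess
warm_start_iters = iterations_to_converge(2.9)   # guess close to the optimum, as a learned policy might provide
```

The warm-started run begins closer to the minimizer and therefore needs fewer steps to reach the same tolerance.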
Abstract
Trajectory Optimization (TO) solvers exploit known system dynamics to compute locally optimal trajectories through iterative improvements. A downside is that each new problem instance is solved independently; therefore, convergence speed and quality of the solution found depend on the initial trajectory proposed. To improve efficiency, a natural approach is to warm-start TO with initial guesses produced by a learned policy trained on trajectories previously generated by the solver. Diffusion-based policies have recently emerged as expressive imitation learning models, making them promising candidates for this role. Yet, a counterintuitive challenge comes from the local optimality of TO demonstrations: when a policy is rolled out, small non-optimal deviations may push it into situations not represented in the training data, triggering compounding errors over long horizons. In this work,…
