DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning

Weikang Wan; Ziyu Wang; Yufei Wang; Zackory Erickson; David Held

arXiv:2402.05421 · cs.LG · June 16, 2025 · 1 citation

TL;DR

DiffTORI introduces a differentiable trajectory optimization framework for deep reinforcement and imitation learning, enabling end-to-end learning of dynamics and cost functions, and outperforms existing methods on complex robotic tasks.

Contribution

The paper presents DiffTORI, a novel approach that leverages differentiable trajectory optimization for end-to-end learning in reinforcement and imitation learning.

Findings

1. Outperforms prior state-of-the-art methods on robotic manipulation tasks.

2. Effective in high-dimensional sensory observation settings.

3. Addresses objective mismatch in model-based RL.

Abstract

This paper introduces DiffTORI, which utilizes Differentiable Trajectory Optimization as the policy representation to generate actions for deep Reinforcement and Imitation learning. Trajectory optimization is a powerful and widely used algorithm in control, parameterized by a cost and a dynamics function. The key to our approach is to leverage the recent progress in differentiable trajectory optimization, which enables computing the gradients of the loss with respect to the parameters of trajectory optimization. As a result, the cost and dynamics functions of trajectory optimization can be learned end-to-end. DiffTORI addresses the "objective mismatch" issue of prior model-based RL algorithms, as the dynamics model in DiffTORI is learned to directly maximize task performance by differentiating the policy gradient loss through the trajectory optimization process. We further benchmark…
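To make the idea of "gradients of the loss with respect to the parameters of trajectory optimization" concrete, here is a minimal sketch in pure Python. It is not the paper's implementation or solver. Everything in it is an illustrative assumption: a scalar toy system with dynamics s_{t+1} = s_t + b * a_t, a quadratic cost, an inner optimizer that is plain unrolled gradient descent over the action sequence, a squared-error imitation loss on the first action (rather than the policy gradient loss the abstract mentions for RL), and a small hypothetical `Dual` class that carries forward-mode derivatives with respect to the single learnable dynamics parameter `b`.

```python
class Dual:
    """Value plus its derivative w.r.t. one parameter (forward-mode autodiff)."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule for the derivative part.
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__


def plan(b, s0=0.0, goal=1.0, horizon=3, r=0.1, steps=200, lr=0.02):
    """Unrolled gradient-descent trajectory optimization (toy, hypothetical).

    Minimizes J = sum_t (s_{t+1} - goal)^2 + r * sum_t a_t^2 over the actions,
    with assumed dynamics s_{t+1} = s_t + b * a_t. Returns the first action.
    """
    actions = [Dual(0.0) for _ in range(horizon)]
    for _ in range(steps):
        # Roll out the dynamics and record the tracking error at each step.
        s, errs = Dual.lift(s0), []
        for a in actions:
            s = s + b * a
            errs.append(s - goal)
        # dJ/da_k = 2*b*sum_{t>=k}(s_{t+1} - goal) + 2*r*a_k
        new_actions = []
        for k, a in enumerate(actions):
            tail = Dual(0.0)
            for e in errs[k:]:
                tail = tail + e
            grad = 2.0 * b * tail + 2.0 * r * a
            new_actions.append(a - lr * grad)
        actions = new_actions
    return actions[0]


def imitation_loss_and_grad(b_val, expert_action):
    """Squared error on the planned first action, and d(loss)/db."""
    a0 = plan(Dual(b_val, 1.0))  # seed db/db = 1 to track derivatives w.r.t. b
    loss = (a0.val - expert_action) ** 2
    grad = 2.0 * (a0.val - expert_action) * a0.dot  # chain rule
    return loss, grad


loss, grad = imitation_loss_and_grad(b_val=1.0, expert_action=0.3)
```

Because the derivative is carried through every inner optimization step, `grad` is the exact derivative of the loss with respect to `b` for this unrolled computation, so a learner could update the dynamics parameter with a step such as `b_val -= step_size * grad`. A learned cost parameter could be handled the same way by seeding its derivative instead.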

Code & Models

Repositories

wkwan7/difftori (official)

Videos

DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning (SlidesLive)

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Human Pose and Action Recognition

Methods: Diffusion