DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning
Weikang Wan, Ziyu Wang, Yufei Wang, Zackory Erickson, David Held

TL;DR
DiffTORI introduces a differentiable trajectory optimization framework for deep reinforcement and imitation learning, enabling end-to-end learning of dynamics and cost functions, and outperforms existing methods on complex robotic tasks.
Contribution
The paper presents DiffTORI, a novel approach that leverages differentiable trajectory optimization for end-to-end learning in reinforcement and imitation learning.
Findings
Outperforms prior state-of-the-art methods on robotic manipulation tasks.
Effective in high-dimensional sensory observation settings.
Addresses objective mismatch in model-based RL.
Abstract
This paper introduces DiffTORI, which utilizes Differentiable Trajectory Optimization as the policy representation to generate actions for deep Reinforcement and Imitation learning. Trajectory optimization is a powerful and widely used algorithm in control, parameterized by a cost and a dynamics function. The key to our approach is to leverage the recent progress in differentiable trajectory optimization, which enables computing the gradients of the loss with respect to the parameters of trajectory optimization. As a result, the cost and dynamics functions of trajectory optimization can be learned end-to-end. DiffTORI addresses the "objective mismatch" issue of prior model-based RL algorithms, as the dynamics model in DiffTORI is learned to directly maximize task performance by differentiating the policy gradient loss through the trajectory optimization process. We further benchmark…
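The idea of differentiating a task loss through a trajectory optimizer can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a scalar one-step problem with hypothetical dynamics x' = x + theta * a, a hypothetical cost (x' - goal)^2 + lam * a^2, plain gradient descent as the inner optimizer, and a hand-written forward-mode sensitivity in place of an automatic differentiation library. All function and parameter names (`plan`, `imitation_loss_and_grad`, `train`, `theta`, `lam`, `alpha`) are made up for illustration. The dynamics gain `theta` is fit only through an imitation loss on the planned action, never through a state-prediction loss.

```python
def plan(theta, x, goal=1.0, lam=0.1, alpha=0.1, steps=50):
    """Inner trajectory optimization by gradient descent on the action.

    Minimizes J(a) = (x + theta * a - goal)**2 + lam * a**2 and, alongside
    the action a, tracks s = d(a)/d(theta) through every unrolled step.
    """
    a, s = 0.0, 0.0
    for _ in range(steps):
        err = x + theta * a - goal
        grad_a = 2.0 * theta * err + 2.0 * lam * a
        # Total derivative of grad_a w.r.t. theta (a depends on theta via s).
        dgrad_dtheta = (2.0 * err + 2.0 * theta * a
                        + (2.0 * theta ** 2 + 2.0 * lam) * s)
        a = a - alpha * grad_a
        s = s - alpha * dgrad_dtheta
    return a, s


def imitation_loss_and_grad(theta, states, expert_actions):
    """Mean squared action error and its gradient w.r.t. theta."""
    loss, grad = 0.0, 0.0
    for x, a_exp in zip(states, expert_actions):
        a, s = plan(theta, x)
        loss += (a - a_exp) ** 2
        grad += 2.0 * (a - a_exp) * s  # chain rule through the planner
    n = len(states)
    return loss / n, grad / n


def train(theta_init=1.0, theta_true=2.0, lr=0.5, iters=200):
    """Fit theta end-to-end from expert actions only."""
    states = [-1.0, 0.0, 0.5]
    # Hypothetical expert: the same planner run with the true dynamics gain.
    expert_actions = [plan(theta_true, x)[0] for x in states]
    theta, history = theta_init, []
    for _ in range(iters):
        loss, grad = imitation_loss_and_grad(theta, states, expert_actions)
        history.append(loss)
        theta -= lr * grad
    return theta, history


if __name__ == "__main__":
    theta, history = train()
    print("learned theta:", theta)
    print("first loss:", history[0], "last loss:", history[-1])
```

In this sketch the outer gradient reaches `theta` only via the planner's output, which is the sense in which a model is trained for task performance rather than prediction accuracy. A real system would replace the scalar problem with multi-step trajectories and learned networks, and would typically rely on an automatic or implicit differentiation tool rather than a hand-derived recurrence.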
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Human Pose and Action Recognition
Methods: Diffusion
