Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control

Quentin Le Lidec; Wilson Jallet; Ivan Laptev; Cordelia Schmid; Justin Carpentier

arXiv:2209.09006·cs.RO·February 17, 2023

Open Access

TL;DR

This paper introduces a novel method combining trajectory optimization and policy learning using Sobolev learning and augmented Lagrangian techniques to improve robot control accuracy and learning efficiency.

Contribution

The paper improves on existing hybrid RL and TO methods in two ways: it leverages sensitivity information from TO, and it enforces consensus between the TO solutions and the policy, with the aim of learning global policies more efficiently.
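To make "enforcing consensus" with an augmented Lagrangian concrete, here is a minimal sketch on a toy problem. It is not the authors' algorithm: the scalar quadratic costs, the names `x` (standing in for a TO variable) and `z` (standing in for a policy output), the penalty `rho`, and the function name are all assumptions chosen for illustration. The toy problem minimizes (x - a)^2 + (z - b)^2 subject to the consensus constraint x = z.

```python
def consensus_augmented_lagrangian(a, b, rho=10.0, iters=30):
    """Toy augmented Lagrangian loop for: min (x-a)^2 + (z-b)^2  s.t.  x = z.

    Hypothetical illustration only: x plays the role of a trajectory
    optimization variable, z the role of a policy output.
    """
    lam = 0.0  # Lagrange multiplier attached to the constraint x - z = 0
    for _ in range(iters):
        # Joint minimiser of the augmented Lagrangian
        #   (x-a)^2 + (z-b)^2 + lam*(x-z) + 0.5*rho*(x-z)^2
        # obtained in closed form from the two stationarity conditions.
        gap = ((a - b) - lam) / (1.0 + rho)  # value of x - z at the minimiser
        x = 0.5 * ((a + b) + gap)
        z = 0.5 * ((a + b) - gap)
        # Multiplier update: penalise the remaining disagreement.
        lam += rho * (x - z)
    return x, z, lam


x, z, lam = consensus_augmented_lagrangian(a=1.0, b=3.0)
print(x, z, lam)
```

In this toy setting the disagreement x - z shrinks by a factor 1 / (1 + rho) per iteration, so both variables approach the consensus value (a + b) / 2 without requiring an infinite penalty.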

Findings

01. Faster convergence to feasible policies in robotics tasks

02. Improved control accuracy over existing methods

03. Effective enforcement of TO-policy consensus

Abstract

Reinforcement learning (RL) and trajectory optimization (TO) present strong complementary advantages. On one hand, RL approaches are able to learn global control policies directly from data, but generally require large sample sizes to properly converge towards feasible policies. On the other hand, TO methods are able to exploit gradient-based information extracted from simulators to quickly converge towards a locally optimal control trajectory which is only valid within the vicinity of the solution. Over the past decade, several approaches have aimed to adequately combine the two classes of methods in order to obtain the best of both worlds. Following on from this line of research, we propose several improvements on top of these approaches to learn global control policies quicker, notably by leveraging sensitivity information stemming from TO methods via Sobolev learning, and augmented…
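The abstract mentions Sobolev learning, which fits a model to target derivatives as well as target values. The sketch below illustrates the general idea on a one-dimensional regression problem. It is not the paper's implementation: the cubic polynomial features, the `weight` parameter, the function names, and the synthetic targets are assumptions. In the TO setting, the derivative targets would correspond to sensitivity information supplied by the solver.

```python
import numpy as np


def features(x):
    """Cubic polynomial features [1, x, x^2, x^3] (an illustrative choice)."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x**2, x**3], axis=1)


def feature_derivs(x):
    """Derivatives of the features with respect to x."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.zeros_like(x), np.ones_like(x), 2 * x, 3 * x**2], axis=1)


def fit_value_only(x, u):
    """Least-squares fit on target values only."""
    return np.linalg.lstsq(features(x), u, rcond=None)[0]


def fit_sobolev(x, u, du, weight=1.0):
    """Least-squares fit on target values and target derivatives."""
    A = np.vstack([features(x), np.sqrt(weight) * feature_derivs(x)])
    b = np.concatenate([u, np.sqrt(weight) * du])
    return np.linalg.lstsq(A, b, rcond=None)[0]


# Synthetic ground truth: u(x) = 1 + 2x - x^2 + 0.5x^3, sampled at two points.
w_true = np.array([1.0, 2.0, -1.0, 0.5])
x = np.array([-1.0, 1.0])
u = features(x) @ w_true          # target values
du = feature_derivs(x) @ w_true   # target derivatives (the "sensitivities")

w_value = fit_value_only(x, u)
w_sobolev = fit_sobolev(x, u, du)
print("value-only:", w_value)
print("sobolev:   ", w_sobolev)
```

With only two samples, the value-only fit is underdetermined and returns the minimum-norm solution, while the two extra derivative equations let the Sobolev fit recover all four coefficients. This is the sample-efficiency argument in miniature: each sample carries more information when its derivative is also used.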

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms