Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control
Quentin Le Lidec, Wilson Jallet, Ivan Laptev, Cordelia Schmid, Justin Carpentier

TL;DR
This paper introduces a novel method combining trajectory optimization and policy learning using Sobolev learning and augmented Lagrangian techniques to improve robot control accuracy and learning efficiency.
Contribution
It improves on existing hybrid RL and TO methods by leveraging sensitivity information from TO and by enforcing consensus between TO and the learned policy, which speeds up global policy learning.
Findings
Faster convergence to feasible policies in robotics tasks
Improved control accuracy over existing methods
Effective enforcement of TO-policy consensus
Abstract
Reinforcement learning (RL) and trajectory optimization (TO) present strong complementary advantages. On one hand, RL approaches are able to learn global control policies directly from data, but generally require large sample sizes to properly converge towards feasible policies. On the other hand, TO methods are able to exploit gradient-based information extracted from simulators to quickly converge towards a locally optimal control trajectory which is only valid within the vicinity of the solution. Over the past decade, several approaches have aimed to adequately combine the two classes of methods in order to obtain the best of both worlds. Following on from this line of research, we propose several improvements on top of these approaches to learn global control policies quicker, notably by leveraging sensitivity information stemming from TO methods via Sobolev learning, and augmented…
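To make the Sobolev learning idea in the abstract concrete, the sketch below fits a policy to both the controls and the control sensitivities (feedback gains) that a TO solver might return. It is a minimal illustration and not the authors' implementation: the linear policy u = W x + b, the function name sobolev_loss, the weight parameter, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def sobolev_loss(W, b, states, controls, gains, weight=1.0):
    """Value-matching plus derivative-matching loss for a linear policy u = W x + b.

    states:   (N, nx) states visited by TO trajectories (hypothetical data)
    controls: (N, nu) TO controls at those states
    gains:    (N, nu, nx) TO sensitivities du/dx at those states
    """
    pred = states @ W.T + b  # policy output at each state, shape (N, nu)
    value_term = np.mean(np.sum((pred - controls) ** 2, axis=1))
    # For a linear policy the Jacobian d(pi)/dx equals W at every state.
    jac_term = np.mean(np.sum((W[None] - gains) ** 2, axis=(1, 2)))
    return value_term + weight * jac_term

# Synthetic "TO" data generated from a known linear feedback law.
rng = np.random.default_rng(0)
W_true = np.array([[1.0, -2.0]])
b_true = np.array([0.5])
states = rng.normal(size=(5, 2))
controls = states @ W_true.T + b_true
gains = np.tile(W_true, (5, 1, 1))

loss_at_truth = sobolev_loss(W_true, b_true, states, controls, gains)
loss_at_zero = sobolev_loss(np.zeros_like(W_true), b_true, states, controls, gains)
```

The derivative term penalizes a policy whose local feedback behavior differs from the TO solution even where the controls themselves agree, which is the extra supervision that sensitivity information provides. With a neural network policy, the Jacobian would be obtained by automatic differentiation rather than read off as W.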
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms
