Multi-task Reinforcement Learning with a Planning Quasi-Metric
Vincent Micheli, Karthigan Sinnathamby, François Fleuret

TL;DR
This paper presents a reinforcement learning method that pairs a planning quasi-metric with task-specific aimers. Because the quasi-metric is a task-agnostic model of the environment, it can be shared across tasks, which speeds up multi-task training.
Contribution
It introduces a planning quasi-metric combined with aimers for multi-task reinforcement learning, allowing shared environment modeling and improved training efficiency.
Findings
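Achieved a multiple-fold training speed-up over recently published methods on the standard bit-flip problem.
Demonstrated a similar speed-up in the MuJoCo robotic arm simulator.
The quasi-metric captures the environment's dynamics independently of any task, so it can be shared across tasks and learned in a dense, unsupervised manner.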
Abstract
We introduce a new reinforcement learning approach combining a planning quasi-metric (PQM) that estimates the number of steps required to go from any state to another, with task-specific "aimers" that compute a target state to reach a given goal. This decomposition allows the sharing across tasks of a task-agnostic model of the quasi-metric that captures the environment's dynamics and can be learned in a dense and unsupervised manner. We achieve multiple-fold training speed-up compared to recently published methods on the standard bit-flip problem and in the MuJoCo robotic arm simulator.
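The decomposition described in the abstract can be illustrated on the bit-flip problem with a minimal sketch. This is not the authors' implementation: the paper learns the quasi-metric, whereas here a closed-form stand-in is used (the Hamming distance, which is the exact step count when each action flips one bit). The function names `step`, `pqm`, `make_goal_aimer`, `make_prefix_aimer`, `act`, and `rollout`, the prefix task, and the greedy action rule are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]
Aimer = Callable[[State], State]


def step(state: State, action: int) -> State:
    """Bit-flip dynamics (assumed): action i flips bit i."""
    return tuple(b ^ 1 if i == action else b for i, b in enumerate(state))


def pqm(start: State, target: State) -> int:
    """Stand-in planning quasi-metric: number of steps from start to target.

    For bit-flip this is the Hamming distance. In the paper this quantity
    is estimated by a learned model; a closed form is used here instead.
    """
    return sum(a != b for a, b in zip(start, target))


def make_goal_aimer(goal: State) -> Aimer:
    """Hypothetical task 1: reach a fixed bit pattern. The target is the goal."""
    return lambda state: goal


def make_prefix_aimer(k: int) -> Aimer:
    """Hypothetical task 2: set the first k bits to 1, leave the rest as they are."""
    return lambda state: (1,) * k + state[k:]


def act(state: State, aimer: Aimer) -> int:
    """Greedy rule (an assumption): pick the action whose successor is closest to the target."""
    target = aimer(state)
    return min(range(len(state)), key=lambda a: pqm(step(state, a), target))


def rollout(state: State, aimer: Aimer, max_steps: int) -> List[State]:
    """Follow the greedy rule until the aimer's target is reached or steps run out."""
    trajectory = [state]
    for _ in range(max_steps):
        if pqm(state, aimer(state)) == 0:
            break
        state = step(state, act(state, aimer))
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    start = (0, 0, 0, 0)
    # Both tasks reuse the same task-agnostic pqm; only the aimer changes.
    print(rollout(start, make_goal_aimer((1, 0, 1, 1)), max_steps=10))
    print(rollout(start, make_prefix_aimer(2), max_steps=10))
```

Both tasks call the same `pqm`, and only the aimer differs. That mirrors the sharing the abstract describes: the quasi-metric models the environment, while each aimer encodes one task.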
Taxonomy
Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Scheduling and Optimization Algorithms
