Multi-task Reinforcement Learning with a Planning Quasi-Metric

Vincent Micheli; Karthigan Sinnathamby; François Fleuret

arXiv:2002.03240 · cs.LG · December 8, 2020 · 1 citation

Open Access

TL;DR

This paper presents a reinforcement learning method that pairs a planning quasi-metric, which estimates the number of steps needed to go from any state to another, with task-specific aimers that compute a target state for a given goal. Because the quasi-metric is task-agnostic, it can be shared across tasks, which yields a multiple-fold training speed-up.

Contribution

The paper introduces a planning quasi-metric combined with task-specific aimers for multi-task reinforcement learning, so that a single model of the environment's dynamics is shared across tasks and training becomes more efficient.

Findings

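01. Achieved a multiple-fold training speed-up over recently published methods on the standard bit-flip problem.

02. Achieved a multiple-fold training speed-up in the MuJoCo robotic arm simulator.

03. Sharing a task-agnostic quasi-metric model of the environment's dynamics across tasks makes multi-task learning more efficient.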

Abstract

We introduce a new reinforcement learning approach combining a planning quasi-metric (PQM) that estimates the number of steps required to go from any state to another, with task-specific "aimers" that compute a target state to reach a given goal. This decomposition allows the sharing across tasks of a task-agnostic model of the quasi-metric that captures the environment's dynamics and can be learned in a dense and unsupervised manner. We achieve multiple-fold training speed-up compared to recently published methods on the standard bit-flip problem and in the MuJoCo robotic arm simulator.
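The decomposition described in the abstract can be illustrated on a toy bit-flip problem. The sketch below is not the authors' implementation: it assumes that each action flips exactly one bit and that the goal is a full target bit string, and it replaces the learned quasi-metric with the closed-form Hamming distance, which is the exact step count under those assumptions. All names (`hamming_quasi_metric`, `identity_aimer`, `flip`, `greedy_action`, `rollout`) are hypothetical.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]


def hamming_quasi_metric(s: State, t: State) -> int:
    """Stand-in for a learned PQM (assumption): number of single-bit flips from s to t."""
    return sum(a != b for a, b in zip(s, t))


def identity_aimer(state: State, goal: State) -> State:
    """Hypothetical aimer: the goal is assumed to already be a full target state."""
    return goal


def flip(state: State, i: int) -> State:
    """Environment step: flip bit i and return the new state."""
    return state[:i] + (1 - state[i],) + state[i + 1:]


def greedy_action(state: State, target: State,
                  metric: Callable[[State, State], int]) -> int:
    """Pick the action whose successor state is closest to the target under the metric."""
    return min(range(len(state)), key=lambda i: metric(flip(state, i), target))


def rollout(start: State, goal: State,
            aimer: Callable[[State, State], State] = identity_aimer,
            metric: Callable[[State, State], int] = hamming_quasi_metric,
            max_steps: int = 100) -> List[State]:
    """Repeatedly aim at a target state, then act greedily on the quasi-metric."""
    state, path = start, [start]
    for _ in range(max_steps):
        target = aimer(state, goal)
        if metric(state, target) == 0:
            break
        state = flip(state, greedy_action(state, target, metric))
        path.append(state)
    return path


path = rollout((0, 0, 1, 0), (1, 0, 0, 1))
print(len(path) - 1)  # 3 steps, equal to the Hamming distance between start and goal
```

In this sketch only the aimer depends on the task, while the metric depends only on the environment's dynamics. A new task would therefore swap in a different aimer and reuse the same metric, which is the kind of sharing the abstract describes.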

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Path Planning Algorithms · Scheduling and Optimization Algorithms