Plan Arithmetic: Compositional Plan Vectors for Multi-Task Control

Coline Devin; Daniel Geng; Pieter Abbeel; Trevor Darrell; Sergey Levine

arXiv:1910.14033·cs.LG·August 4, 2020·1 cites

Open Access

TL;DR

This paper introduces compositional plan vectors (CPVs), which represent a trajectory as the sum of the subtasks within it. This lets a policy generalize to composed tasks and combine tasks through vector arithmetic, without extra supervision.

Contribution

The paper proposes CPVs, a method for representing and composing tasks for multi-task control. CPVs are learned within a one-shot imitation learning framework and allow a policy to perform compositions of tasks without additional supervision.

Findings

01

CPVs enable a demonstration-conditioned policy to generalize to tasks that sequence twice as many skills as the tasks seen during training.

02

CPVs support arithmetic operations, allowing task composition without additional training.

03

The method works within a one-shot imitation learning framework without extra supervision.

Abstract

Autonomous agents situated in real-world environments must be able to master large repertoires of skills. While a single short skill can be learned quickly, it would be impractical to learn every task independently. Instead, the agent should share knowledge across behaviors such that each task can be learned efficiently, and such that the resulting model can generalize to new tasks, especially ones that are compositions or subsets of tasks seen previously. A policy conditioned on a goal or demonstration has the potential to share knowledge between tasks if it sees enough diversity of inputs. However, these methods may not generalize to a more complex task at test time. We introduce compositional plan vectors (CPVs) to enable a policy to perform compositions of tasks without additional supervision. CPVs represent trajectories as the sum of the subtasks within them. We show that CPVs can…
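The statement that CPVs represent trajectories as the sum of the subtasks within them can be illustrated with a toy sketch. This is not the authors' implementation: in the paper the embedding is learned, whereas here fixed random vectors stand in for it. The subtask names, the dimension `DIM`, and the helpers `embed`, `add`, `subtract`, and `close` are all hypothetical and exist only to show the arithmetic.

```python
import math
import random

DIM = 8  # embedding size, chosen arbitrarily for this toy example

# Hypothetical subtask names. Fixed random vectors stand in for a learned
# encoder; a real CPV embedding would be trained, not sampled.
_rng = random.Random(0)
SUBTASK_VECTORS = {
    name: [_rng.gauss(0.0, 1.0) for _ in range(DIM)]
    for name in ("chop_tree", "build_house", "make_bread", "eat_bread")
}


def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


def subtract(u, v):
    """Element-wise difference of two vectors."""
    return [a - b for a, b in zip(u, v)]


def embed(subtasks):
    """Embed a task as the sum of its subtask vectors (the assumed CPV property)."""
    total = [0.0] * DIM
    for name in subtasks:
        total = add(total, SUBTASK_VECTORS[name])
    return total


def close(u, v, tol=1e-9):
    """Compare two vectors up to floating-point tolerance."""
    return all(math.isclose(a, b, abs_tol=tol) for a, b in zip(u, v))


task_a = ["chop_tree", "build_house"]
task_b = ["make_bread", "eat_bread"]

# Composition: adding two task embeddings gives the embedding of the combined task.
composed = add(embed(task_a), embed(task_b))
print(close(composed, embed(task_a + task_b)))

# Subtraction: removing what is already done leaves the embedding of what remains.
remaining = subtract(embed(task_a + task_b), embed(["chop_tree"]))
print(close(remaining, embed(["build_house", "make_bread", "eat_bread"])))
```

Both checks hold up to floating-point tolerance. Because a sum does not depend on the order of its terms, a representation of this kind records which subtasks a task contains rather than the order in which they occur.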


Taxonomy

Topics: Reinforcement Learning in Robotics · Topic Modeling · Multimodal Machine Learning Applications