Making Universal Policies Universal

Niklas Höpner; David Kuric; Herke van Hoof

arXiv:2502.14777 · cs.AI · February 21, 2025

TL;DR

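This paper extends the universal policy framework to a cross-agent setting in which agents share an observation space but differ in their action spaces. A diffusion-based planner trained on pooled data from all agents generates observation sequences, and an inverse dynamics model assigns actions to them, enabling positive transfer across agents and generalization to unseen agents.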

Contribution

The paper proposes a method for training the universal policy's planner on a joint dataset of trajectories from all agents, so that pooling data yields positive transfer while shared plans are adapted to each agent's constraints.

Findings

01. Achieved up to 42.20% improvement in task accuracy over single-agent training.

02. Demonstrated positive transfer across different agents in the BabyAI environment.

03. Showed that the planner can generalize to unseen agents.

Abstract

The development of a generalist agent capable of solving a wide range of sequential decision-making tasks remains a significant challenge. We address this problem in a cross-agent setup where agents share the same observation space but differ in their action spaces. Our approach builds on the universal policy framework, which decouples policy learning into two stages: a diffusion-based planner that generates observation sequences and an inverse dynamics model that assigns actions to these plans. We propose a method for training the planner on a joint dataset composed of trajectories from all agents. This method offers the benefit of positive transfer by pooling data from different agents, while the primary challenge lies in adapting shared plans to each agent's unique constraints. We evaluate our approach on the BabyAI environment, covering tasks of varying complexity, and demonstrate…
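The two-stage decoupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the diffusion-based planner is replaced by a hand-written path generator, the learned inverse dynamics model by a lookup table, and the grid observations, agent names (`cardinal`, `right_down_only`), and all function names are hypothetical. It only shows how one shared observation plan can be labeled with actions from different action spaces, and how a plan can turn out to be infeasible for a particular agent.

```python
from typing import Dict, List, Optional, Tuple

Obs = Tuple[int, int]  # toy observation: (row, col) position on a grid


def plan_observations(start: Obs, goal: Obs) -> List[Obs]:
    """Stand-in for the shared planner (NOT a diffusion model).

    Returns an agent-agnostic sequence of observations: rows first, then columns.
    """
    path = [start]
    r, c = start
    while r != goal[0]:
        r += 1 if goal[0] > r else -1
        path.append((r, c))
    while c != goal[1]:
        c += 1 if goal[1] > c else -1
        path.append((r, c))
    return path


# Hypothetical action spaces: each agent maps an observation delta to its own action id.
ACTION_SPACES: Dict[str, Dict[Tuple[int, int], int]] = {
    "cardinal": {(-1, 0): 0, (1, 0): 1, (0, -1): 2, (0, 1): 3},
    "right_down_only": {(1, 0): 0, (0, 1): 1},
}


def inverse_dynamics(agent: str, obs: Obs, next_obs: Obs) -> Optional[int]:
    """Stand-in for a per-agent inverse dynamics model (a lookup, not a learned network).

    Returns the agent's action id for the transition, or None if the agent
    has no action that produces it.
    """
    delta = (next_obs[0] - obs[0], next_obs[1] - obs[1])
    return ACTION_SPACES[agent].get(delta)


def act_on_plan(agent: str, plan: List[Obs]) -> List[Optional[int]]:
    """Label every consecutive pair of planned observations with an action."""
    return [inverse_dynamics(agent, a, b) for a, b in zip(plan, plan[1:])]


# One shared plan, two agents with different action spaces.
plan = plan_observations((0, 0), (2, 1))
cardinal_actions = act_on_plan("cardinal", plan)
restricted_actions = act_on_plan("right_down_only", plan)

# A plan that moves upward is executable by one agent but not the other.
upward_plan = plan_observations((2, 0), (0, 0))
upward_cardinal = act_on_plan("cardinal", upward_plan)
upward_restricted = act_on_plan("right_down_only", upward_plan)
```

In this sketch the `None` entries in `upward_restricted` mark transitions the restricted agent cannot execute, which is a simplified view of the challenge the abstract names: adapting shared plans to each agent's unique constraints.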

Code & Models

Repositories

NikeHop/UniversalPolicies (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Robot Manipulation and Learning