PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning

Chengyang Ying; Zhongkai Hao; Xinning Zhou; Xuezhou Xu; Hang Su; Xingxing Zhang; Jun Zhu

arXiv:2405.14073 · cs.LG · May 20, 2025

TL;DR

This paper introduces PEAC, an unsupervised pre-training method for reinforcement learning agents to adapt across different embodiments, improving generalization and transfer in diverse environments.

Contribution

We propose CEURL and develop PEAC, a novel algorithm for embodiment-aware, task-agnostic pre-training using unsupervised RL in reward-free settings.
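To make the reward-free, multi-embodiment setting concrete, here is a minimal sketch of the kind of data collection such pre-training starts from. It is not PEAC itself and is not taken from the thu-ml/CEURL repository. The names `Embodiment`, `Transition`, `step`, and `collect_reward_free`, the 1-D point-mass dynamics, and the random exploration policy are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Embodiment:
    """Hypothetical embodiment: a 1-D point mass; bodies differ only in mass."""
    name: str
    mass: float


@dataclass
class Transition:
    """One interaction step. There is deliberately no reward field,
    because the pre-training environments are reward-free."""
    embodiment: str
    state: float
    action: float
    next_state: float


def step(emb: Embodiment, state: float, action: float) -> float:
    # Toy dynamics (an assumption): the same action moves a heavier body less.
    return state + action / emb.mass


def collect_reward_free(embodiments, episodes, horizon, seed=0):
    """Sample an embodiment per episode and log transitions without rewards."""
    rng = random.Random(seed)
    buffer = []
    for _ in range(episodes):
        emb = rng.choice(embodiments)  # embodiment drawn from a fixed set
        state = 0.0
        for _ in range(horizon):
            # Placeholder exploration policy; an unsupervised RL method
            # would use an intrinsic objective here instead.
            action = rng.uniform(-1.0, 1.0)
            next_state = step(emb, state, action)
            buffer.append(Transition(emb.name, state, action, next_state))
            state = next_state
    return buffer


embodiments = [Embodiment("light", 1.0), Embodiment("heavy", 4.0)]
buffer = collect_reward_free(embodiments, episodes=6, horizon=5)
print(len(buffer))  # 30 transitions: 6 episodes x 5 steps
```

Each logged transition carries an embodiment label but no task signal. That is the sense in which the collected experience can be embodiment-aware and task-agnostic. How PEAC turns such experience into a pre-training objective is described in the paper, not here.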

Findings

01 PEAC enhances cross-embodiment generalization in simulated environments.

02 PEAC improves adaptation performance in real-world legged locomotion tasks.

03 The method integrates seamlessly with existing unsupervised RL approaches.

Abstract

Designing generalizable agents capable of adapting to diverse embodiments has attracted significant attention in Reinforcement Learning (RL) and is critical for deploying RL agents in various real-world applications. Previous Cross-Embodiment RL approaches have focused on transferring knowledge across embodiments within specific tasks. These methods often yield knowledge that is tightly coupled to those tasks and fail to adequately capture the distinct characteristics of different embodiments. To address this limitation, we introduce the notion of Cross-Embodiment Unsupervised RL (CEURL), which leverages unsupervised learning to enable agents to acquire embodiment-aware and task-agnostic knowledge through online interactions within reward-free environments. We formulate CEURL as a novel Controlled Embodiment Markov Decision Process (CE-MDP) and systematically analyze CEURL's…

Code & Models

Repositories

thu-ml/CEURL (PyTorch, official)

Videos

PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Anomaly Detection Techniques and Applications