Polybot: Training One Policy Across Robots While Embracing Variability
Jonathan Yang, Dorsa Sadigh, Chelsea Finn

TL;DR
Polybot trains a single policy that transfers manipulation skills across diverse robotic platforms by aligning observation and action spaces and then aligning internal representations, improving success rates and sample efficiency.
Contribution
The paper presents a novel framework for training a single policy applicable to multiple robots by combining observation/action space alignment and contrastive learning of internal representations.
Findings
Significant improvements in success rate across different robots.
Better sample efficiency when learning from new task data.
Contrastive learning bridges the domain shift that remains after observation and action alignment.
Abstract
Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday scenarios due to the high cost of collecting robotic datasets. However, robotic platforms possess varying control schemes, camera viewpoints, kinematic configurations, and end-effector morphologies, posing significant challenges when transferring manipulation skills from one platform to another. To tackle this problem, we propose a set of key design decisions to train a single policy for deployment on multiple robotic platforms. Our framework first aligns the observation and action spaces of our policy across embodiments by using wrist cameras and a unified but modular codebase. To bridge the remaining domain shift, we align our policy's internal representations across embodiments through contrastive learning. We evaluate our method on a dataset collected over 60 hours spanning 6 tasks and 3…
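The contrastive alignment step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact loss, so the InfoNCE objective, the function name info_nce_loss, the temperature value, and the way pairs are formed (row i of each batch taken to be embeddings of corresponding states from two different robots) are all assumptions made here for illustration, not the authors' implementation.

```python
import numpy as np


def info_nce_loss(anchor, positive, temperature=0.1):
    """InfoNCE loss over a batch of paired embeddings (illustrative only).

    Assumption: anchor[i] and positive[i] are policy embeddings of matching
    states observed on two different robots. Every other row in the batch
    serves as a negative for row i.
    """
    # L2-normalize so the dot product is cosine similarity.
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    p = positive / np.linalg.norm(positive, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by the temperature.
    logits = (a @ p.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The matching pair for row i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


# Synthetic stand-ins for embeddings; no real robot data is used here.
rng = np.random.default_rng(0)
emb_robot_a = rng.normal(size=(8, 16))
emb_robot_b_aligned = emb_robot_a + 0.01 * rng.normal(size=(8, 16))
emb_robot_b_unaligned = rng.normal(size=(8, 16))

loss_aligned = info_nce_loss(emb_robot_a, emb_robot_b_aligned)
loss_unaligned = info_nce_loss(emb_robot_a, emb_robot_b_unaligned)
```

In this toy setup the loss is lower when the two robots' embeddings for the same state nearly coincide, which is the property a contrastive objective of this kind rewards during training.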
Taxonomy
Topics: Robot Manipulation and Learning · Stroke Rehabilitation and Recovery · Reinforcement Learning in Robotics
Methods: ALIGN
