Perception Stitching: Zero-Shot Perception Encoder Transfer for Visuomotor Robot Policies
Pingcheng Jian, Easop Lee, Zachary Bell, Michael M. Zavlanos, Boyuan Chen

TL;DR
Perception Stitching lets a visuomotor robot policy adapt zero-shot to large visual changes by stitching together modular visual encoders, and it outperforms baseline methods on simulated and real-world manipulation tasks.
Contribution
The paper enforces modularity of visual encoders by aligning latent visual features across different visuomotor policies, so that an encoder can be stitched zero-shot onto a policy network trained under partially different visual conditions.
Findings
Achieved zero-shot success in real-world manipulation tasks.
Outperformed baseline methods in various simulated and real-world tests.
Provided insights into the learned feature representations.
Abstract
Vision-based imitation learning has shown promising capabilities of endowing robots with various motion skills given visual observation. However, current visuomotor policies fail to adapt to drastic changes in their visual observations. We present Perception Stitching that enables strong zero-shot adaptation to large visual changes by directly stitching novel combinations of visual encoders. Our key idea is to enforce modularity of visual encoders by aligning the latent visual features among different visuomotor policies. Our method disentangles the perceptual knowledge with the downstream motion skills and allows the reuse of the visual encoders by directly stitching them to a policy network trained with partially different visual conditions. We evaluate our method in various simulated and real-world manipulation tasks. While baseline methods failed at all attempts, our method could…
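The stitching idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class `VisuomotorPolicy`, the helper `stitch`, and the brightness-based encoders are all hypothetical, and the aligned latent space is simply assumed here rather than learned. The sketch only shows the structural point that, once encoders trained under different visual conditions emit comparable latent features, an encoder from one policy can be combined with the policy network of another without retraining.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]
Encoder = Callable[[Sequence[float]], Vector]
Head = Callable[[Vector], Vector]


@dataclass
class VisuomotorPolicy:
    """Hypothetical policy: one visual encoder per camera plus a policy head."""
    encoders: List[Encoder]
    head: Head

    def act(self, observations: Sequence[Sequence[float]]) -> Vector:
        if len(observations) != len(self.encoders):
            raise ValueError("expected one observation per encoder")
        latent: Vector = []
        for encoder, obs in zip(self.encoders, observations):
            latent.extend(encoder(obs))  # concatenate per-camera latents
        return self.head(latent)


def stitch(encoders: Sequence[Encoder], head: Head) -> VisuomotorPolicy:
    """Combine encoders and a head taken from separately trained policies."""
    return VisuomotorPolicy(encoders=list(encoders), head=head)


# Toy encoders. ASSUMPTION: both map the same scene to the same latent,
# standing in for the latent alignment described in the abstract.
def encode_normal(pixels: Sequence[float]) -> Vector:
    return [sum(pixels) / len(pixels)]


def encode_dim(pixels: Sequence[float]) -> Vector:
    # The "dim" camera halves brightness, so this encoder compensates.
    return [2.0 * sum(pixels) / len(pixels)]


def head(latent: Vector) -> Vector:
    # Toy action: difference between the two camera latents.
    return [latent[0] - latent[1]]


# Policy 1 was "trained" with two normal cameras, policy 2 with two dim ones.
policy_1 = VisuomotorPolicy([encode_normal, encode_normal], head)
policy_2 = VisuomotorPolicy([encode_dim, encode_dim], head)

# New condition: camera 0 is dim, camera 1 is normal. Reuse policy 2's
# camera-0 encoder with policy 1's camera-1 encoder and head.
stitched = stitch([policy_2.encoders[0], policy_1.encoders[1]], policy_1.head)

reference_action = policy_1.act([[0.75, 0.75], [0.25, 0.25]])
stitched_action = stitched.act([[0.375, 0.375], [0.25, 0.25]])
```

In this toy setting the stitched policy reproduces the action policy 1 would take on the same scene, which holds only because the two encoders were constructed to share a latent space.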
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Vision and Imaging · Tactile and Sensory Interactions
