Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution
Zhanyi Sun, Shuran Song

TL;DR
Latent Policy Barrier (LPB) makes visuomotor policies more robust by keeping them in in-distribution states with a two-module system, improving data efficiency and reliability without extra human effort.
Contribution
LPB is a framework inspired by Control Barrier Functions that separates expert imitation from out-of-distribution recovery in visuomotor policy learning.
Findings
LPB improves policy robustness in simulated and real-world tasks.
LPB enhances data efficiency, reducing the need for extensive expert data.
LPB achieves reliable manipulation without additional human correction.
Abstract
Visuomotor policies trained via behavior cloning are vulnerable to covariate shift, where small deviations from expert trajectories can compound into failure. Common strategies to mitigate this issue involve expanding the training distribution through human-in-the-loop corrections or synthetic data augmentation. However, these approaches are often labor-intensive, rely on strong task assumptions, or compromise the quality of imitation. We introduce Latent Policy Barrier, a framework for robust visuomotor policy learning. Inspired by Control Barrier Functions, LPB treats the latent embeddings of expert demonstrations as an implicit barrier separating safe, in-distribution states from unsafe, out-of-distribution (OOD) ones. Our approach decouples the role of precise expert imitation and OOD recovery into two separate modules: a base diffusion policy solely on expert data, and a dynamics…
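To make the idea of expert latents acting as an implicit barrier concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how the barrier is evaluated, so the nearest-neighbor Euclidean distance, the fixed threshold, the 2-D toy latents, and the function names are all assumptions chosen for illustration.

```python
import math

def nearest_expert_distance(z, expert_latents):
    """Euclidean distance from latent z to its closest expert latent.

    Assumption: nearest-neighbor distance stands in for the barrier;
    the source does not specify the actual measure.
    """
    return min(math.dist(z, e) for e in expert_latents)

def is_in_distribution(z, expert_latents, threshold):
    """Treat z as in-distribution if it lies within `threshold` of the expert set.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    return nearest_expert_distance(z, expert_latents) <= threshold

# Toy 2-D latents standing in for embeddings of expert demonstration frames.
expert_latents = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
near = (1.0, 0.3)  # close to the expert trajectory
far = (1.0, 4.0)   # far from every expert latent

print(is_in_distribution(near, expert_latents, threshold=0.5))
print(is_in_distribution(far, expert_latents, threshold=0.5))
```

In a setup like the one the abstract describes, a check of this kind would indicate when a state has drifted away from the expert data, at which point a separate recovery mechanism would take over from pure imitation.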
Taxonomy
Topics: Reinforcement Learning in Robotics
