Sample from What You See: Visuomotor Policy Learning via Diffusion Bridge with Observation-Embedded Stochastic Differential Equation

Zhaoyang Liu; Mokai Pan; Zhongyi Wang; Kaizhen Zhu; Haotao Lu; Haipeng Zhang; Jingya Wang; Ye Shi

arXiv:2512.07212 · cs.AI · February 5, 2026

Open Access

TL;DR

BridgePolicy introduces a diffusion-bridge approach that embeds observations into the stochastic dynamics of visuomotor policies, enabling more precise control by starting sampling from an observation-informed prior.

Contribution

The paper presents a novel diffusion-bridge formulation for imitation learning that integrates heterogeneous observations directly into the stochastic process, improving robotic control performance.

Findings

1. Outperforms state-of-the-art generative policies across benchmarks

2. Improves control precision and reliability in real-world tasks

3. Effectively unifies visual and state inputs for diffusion models

Abstract

Imitation learning with diffusion models has advanced robotic control by capturing multi-modal action distributions. However, existing methods typically treat observations only as high-level conditions for the denoising network, rather than integrating them into the stochastic dynamics of the diffusion process itself. As a result, sampling is forced to begin from random noise, which weakens the coupling between perception and control and often yields suboptimal performance. We propose BridgePolicy, a generative visuomotor policy that directly integrates observations into the stochastic dynamics via a diffusion-bridge formulation. By constructing an observation-informed trajectory, BridgePolicy enables sampling to start from a rich and informative prior rather than random noise, substantially improving precision and reliability in control. A key difficulty is that diffusion bridge…
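
To make the contrast with noise-initialized diffusion concrete, here is a minimal sketch of a generic Brownian bridge pinned at two endpoints. It is not the paper's observation-embedded SDE, which the abstract does not specify; it only illustrates the idea that the terminal state of the process can be an observation-derived vector instead of Gaussian noise. The function name `brownian_bridge_sample`, the parameters `T` and `sigma`, and the vectors `action` and `obs_prior` are all assumptions made for illustration.

```python
import numpy as np

def brownian_bridge_sample(x0, xT, t, T=1.0, sigma=1.0, rng=None):
    """Draw x_t from a Brownian bridge pinned at x0 (t=0) and xT (t=T).

    Illustrative only: a textbook bridge, not the SDE used by BridgePolicy.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = t / T
    # The mean interpolates linearly between the two endpoints.
    mean = (1.0 - w) * x0 + w * xT
    # The variance vanishes at both endpoints and peaks at t = T/2.
    std = sigma * np.sqrt(t * (T - t) / T)
    return mean + std * rng.standard_normal(np.shape(x0))

rng = np.random.default_rng(0)
# Hypothetical expert action (the data endpoint, x_0).
action = np.array([0.5, -0.2, 0.1])
# Hypothetical observation embedding mapped into action space (x_T).
obs_prior = np.array([0.4, -0.1, 0.0])

start = brownian_bridge_sample(action, obs_prior, t=0.0, rng=rng)
mid = brownian_bridge_sample(action, obs_prior, t=0.5, rng=rng)
end = brownian_bridge_sample(action, obs_prior, t=1.0, rng=rng)
```

In this sketch the process ends exactly at `obs_prior`, so a reverse-time sampler would start from that observation-derived point rather than from a standard Gaussian draw. How observations of different modalities are mapped to such an endpoint is left open here, since the abstract is cut off before it addresses that.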

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis