Ada3Drift: Adaptive Training-Time Drifting for One-Step 3D Visuomotor Robotic Manipulation
Chongyang Xu, Yixian Zou, Ziliang Feng, Fanman Meng, Shuaicheng Liu

TL;DR
Ada3Drift introduces a training-time drifting field to improve multimodal action generation in robotic manipulation, achieving high-fidelity single-step outputs with fewer function evaluations.
Contribution
It shifts the iterative refinement process from inference to training, enabling real-time, multimodal 3D visuomotor control with fewer computational resources.
Findings
State-of-the-art performance on multiple benchmarks
Achieves 10x fewer function evaluations than diffusion methods
Effective in real-world robotic tasks
Abstract
Diffusion-based visuomotor policies effectively capture multimodal action distributions through iterative denoising, but their high inference latency limits real-time robotic control. Recent flow matching and consistency-based methods achieve single-step generation, yet sacrifice the ability to preserve distinct action modes, collapsing multimodal behaviors into averaged, often physically infeasible trajectories. We observe that the compute budget asymmetry in robotics (offline training vs. real-time inference) naturally motivates recovering this multimodal fidelity by shifting iterative refinement from inference time to training time. Building on this insight, we propose Ada3Drift, which learns a training-time drifting field that attracts predicted actions toward expert demonstration modes while repelling them from other generated samples, enabling high-fidelity single-step generation…
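The attract-and-repel idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the form of the drifting field, so the kernel-weighted (mean-shift style) attraction and repulsion below, the function name `drift_field`, and the `bandwidth` parameter are all assumptions chosen for illustration only.

```python
import numpy as np


def drift_field(generated, experts, bandwidth=0.5):
    """Hypothetical drift: pull each generated action toward nearby expert
    actions and push it away from the other generated actions.

    generated: (n, d) array of predicted actions, n >= 2
    experts:   (m, d) array of expert demonstration actions
    Returns an (n, d) array of drift vectors.
    """

    def kernel_mean(queries, points, exclude_self=False):
        # Pairwise squared distances, shape (n_queries, n_points).
        d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
        logits = -d2 / (2.0 * bandwidth ** 2)
        if exclude_self:
            # A sample should not repel itself.
            np.fill_diagonal(logits, -np.inf)
        # Numerically stable softmax over the points axis.
        logits = logits - logits.max(axis=1, keepdims=True)
        weights = np.exp(logits)
        weights /= weights.sum(axis=1, keepdims=True)
        return weights @ points

    # Attraction: vector toward the kernel-weighted mean of expert actions.
    attract = kernel_mean(generated, experts) - generated
    # Repulsion: vector toward the other generated samples, which is subtracted.
    repel = kernel_mean(generated, generated, exclude_self=True) - generated
    return attract - repel


# Toy 1-D example: one expert action, two generated actions.
generated = np.array([[0.5], [-5.0]])
experts = np.array([[1.0]])
print(drift_field(generated, experts))
```

Under these assumptions, one plausible way to use such a field at training time would be to regress the one-step policy output toward `generated + step * drift_field(generated, experts)` with the target held fixed, so that all refinement happens offline and inference remains a single forward pass. The `step` size here is likewise hypothetical.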
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis
