TL;DR
A2A (Action-to-Action flow matching) is a robot policy that starts action generation from historical proprioceptive actions instead of random Gaussian noise, producing high-quality actions in fewer inference steps and with greater robustness to visual perturbations.
Contribution
A2A shifts from random noise sampling to informed initialization using proprioceptive history, reducing inference steps and enhancing generalization in diffusion-based policies.
Findings
A2A achieves high-quality actions in a single inference step.
A2A demonstrates robustness to visual perturbations.
A2A extends effectively to video generation.
Abstract
Diffusion-based policies have recently achieved remarkable success in robotics by formulating action prediction as a conditional denoising process. However, the standard practice of sampling from random Gaussian noise often requires multiple iterative steps to produce clean actions, and the resulting inference latency is a major bottleneck for real-time control. In this paper, we challenge the necessity of uninformed noise sampling and propose Action-to-Action flow matching (A2A), a novel policy paradigm that shifts from random sampling to initialization informed by the previous proprioceptive action. Unlike existing methods that treat proprioceptive action feedback as static conditions, A2A leverages historical proprioceptive sequences, embedding them into a high-dimensional latent space as the starting point for action generation. This design bypasses costly iterative denoising…
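The contrast the abstract draws, starting generation from Gaussian noise versus starting from recent actions, can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_velocity` is a hand-written, deliberately imperfect stand-in for a trained velocity network, `init_from_history` replaces A2A's learned latent embedding with a simple repeat of the last action, and the action values are made up. It shows only that a single Euler step lands closer to the target when the starting point is already near it.

```python
import numpy as np

def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.array(x0, dtype=float)  # copy so the caller's array is untouched
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x

def init_from_history(history, horizon):
    """Hypothetical stand-in for a learned embedding of past actions:
    repeat the most recent action across the prediction horizon."""
    return np.tile(history[-1], (horizon, 1))

horizon, action_dim = 4, 2
# Made-up proprioceptive history (3 past actions) and a made-up target
# action chunk that continues smoothly from the last recorded action.
history = np.array([[0.90, -0.60], [0.95, -0.55], [1.00, -0.50]])
target = history[-1] + 0.05 * np.arange(1, horizon + 1)[:, None]

def toy_velocity(x, t):
    """Toy velocity field (NOT a trained model): points toward the target
    with gain 0.5, so one Euler step only halves the remaining error."""
    return 0.5 * (target - x)

rng = np.random.default_rng(0)
x0_noise = rng.standard_normal((horizon, action_dim))  # uninformed start
x0_informed = init_from_history(history, horizon)      # history-based start

# One inference step from each starting point.
err_noise = np.linalg.norm(euler_sample(toy_velocity, x0_noise, 1) - target)
err_informed = np.linalg.norm(euler_sample(toy_velocity, x0_informed, 1) - target)
print(f"one-step error from noise:   {err_noise:.3f}")
print(f"one-step error from history: {err_informed:.3f}")
```

In this toy the remaining error after one step is proportional to the initial distance from the target, which is why the history-based start does better. How much of that carries over to a learned policy depends on the model and on the parts of the method not covered in this excerpt.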
