Imitation Learning Policy based on Multi-Step Consistent Integration Shortcut Model
Yu Fang, Xinyu Wang, Xuehe Zhang, Wanli Xue, Mingwei Zhang, Shengyong Chen, Jie Zhao

TL;DR
This paper introduces a multi-step integration shortcut model for robot imitation learning that significantly improves inference speed while maintaining high performance, validated through simulation and real-world experiments.
Contribution
The paper proposes a one-step shortcut method with multi-step integration, together with an adaptive gradient allocation scheme, to improve optimization stability and inference efficiency in robot imitation learning.
Findings
Improved inference speed with maintained performance.
Effective in both simulation and real-world tasks.
Stable optimization achieved through adaptive gradient allocation.
Abstract
The wide application of flow-matching methods has greatly promoted the development of robot imitation learning. However, these methods all suffer from long inference times. To address this issue, researchers have proposed distillation methods and consistency methods, but the performance of these methods still struggles to match that of the original diffusion and flow-matching models. In this article, we propose a one-step shortcut method with multi-step integration for robot imitation learning. First, to balance inference speed and performance, we extend the shortcut model with a multi-step consistency loss, splitting the one-step loss into multi-step losses to improve the performance of one-step inference. Second, to address the unstable optimization of the multi-step loss and the original flow-matching loss, we propose an adaptive gradient…
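The idea of splitting a one-step loss into multi-step losses can be illustrated with a small sketch. This is not the authors' implementation: the function names (`multi_step_target`, `consistency_loss`), the toy models, and the choice of chaining `k` equal sub-steps are all assumptions made for illustration. The sketch assumes a shortcut-style model `model(x, t, d)` that returns an average velocity for a jump of size `d` starting at time `t`, and compares its single large jump against the result of chaining `k` smaller jumps.

```python
import numpy as np

def multi_step_target(model, x, t, d, k):
    """Average velocity over [t, t + d] from chaining k sub-steps of size d / k.

    Hypothetical helper: `model(x, t, d)` is assumed to return the average
    velocity for a jump of size d starting from state x at time t.
    """
    sub = d / k
    x_cur, t_cur = x.copy(), t
    for _ in range(k):
        x_cur = x_cur + sub * model(x_cur, t_cur, sub)  # one small shortcut jump
        t_cur = t_cur + sub
    return (x_cur - x) / d  # velocity that a single jump of size d should match

def consistency_loss(model, x, t, d, k):
    """Mean squared gap between the one-step prediction and the k-step target."""
    # In actual training the target would be held fixed (stop-gradient);
    # NumPy has no autograd, so that detail is only noted here.
    target = multi_step_target(model, x, t, d, k)
    pred = model(x, t, d)
    return float(np.mean((pred - target) ** 2))

# Toy models (assumptions, not from the paper).
def constant_model(x, t, d):
    return np.full_like(x, 2.0)   # same velocity everywhere: trivially consistent

def linear_model(x, t, d):
    return -x                     # ignores d, so large and small jumps disagree

x0 = np.ones(3)
loss_consistent = consistency_loss(constant_model, x0, 0.0, 1.0, 4)
loss_inconsistent = consistency_loss(linear_model, x0, 0.0, 1.0, 4)
```

With `k = 2` the target reduces to the mean of two half-step outputs, which is the self-consistency form usually associated with shortcut models. Larger `k` compares the one-step jump against a finer integration. How the paper weights these terms against the flow-matching loss is not recoverable from the truncated abstract and is left out here.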
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Human Pose and Action Recognition
