Visual Prompt Guided Unified Pushing Policy
Hieu Bui, Ziyan Gao, Yuya Hosoda, Joo-Ho Lee

TL;DR
This paper introduces a unified, visual-prompt-guided pushing policy that improves the efficiency and versatility of robotic pushing across different scenarios and outperforms existing baselines.
Contribution
It proposes a novel flow matching policy with visual prompts for reactive, multimodal pushing, enabling broad applicability and integration into planning frameworks.
Findings
Outperforms existing pushing methods in efficiency and versatility
Serves effectively as a low-level primitive in planning frameworks
Successfully applied to table-cleaning tasks
Abstract
As one of the simplest non-prehensile manipulation skills, pushing has been widely studied as an effective means to rearrange objects. Existing approaches, however, typically rely on multi-step push plans composed of pre-defined pushing primitives with limited application scopes, which restrict their efficiency and versatility across different scenarios. In this work, we propose a unified pushing policy that incorporates a lightweight prompting mechanism into a flow matching policy to guide the generation of reactive, multimodal pushing actions. The visual prompt can be specified by a high-level planner, enabling the reuse of the pushing policy across a wide range of planning problems. Experimental results demonstrate that the proposed unified pushing policy not only outperforms existing baselines but also effectively serves as a low-level primitive within a VLM-guided planning…
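To make the idea of prompt-conditioned flow matching concrete, here is a minimal sketch of how sampling from such a policy could look. It is not the authors' implementation. The names `toy_velocity` and `sample_action`, the 2D prompt vector, and the step count are all assumptions for illustration. A real policy would replace `toy_velocity` with a learned network conditioned on images and the visual prompt. Here a closed-form field that transports noise along straight lines to a single prompt-specified target stands in for it.

```python
import random


def toy_velocity(x, t, prompt):
    """Hypothetical stand-in for a learned velocity network v(x, t | prompt).

    For straight-line paths that all end at one target point `prompt`,
    the exact velocity field is (prompt - x) / (1 - t).
    """
    return [(p - xi) / (1.0 - t) for xi, p in zip(x, prompt)]


def sample_action(prompt, steps=10, seed=0):
    """Integrate dx/dt = v(x, t | prompt) from t=0 to t=1 with Euler steps."""
    rng = random.Random(seed)
    # Start from Gaussian noise at t=0, one value per action dimension.
    x = [rng.gauss(0.0, 1.0) for _ in prompt]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1, so the division is safe
        v = toy_velocity(x, t, prompt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Assumed usage: the prompt encodes a 2D push target in some workspace frame.
action = sample_action(prompt=[0.3, -0.2])
```

In this toy setting the prompt alone determines where the sample lands. In a multimodal policy the learned field would instead map different noise samples to different valid pushing actions for the same prompt.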
Taxonomy
Topics: Robotic Path Planning Algorithms · AI-based Problem Solving and Planning · Robot Manipulation and Learning
