AeroPlace-Flow: Language-Grounded Object Placement for Aerial Manipulators via Visual Foresight and Object Flow
Sarthak Mishra, Rishabh Dev Yadav, Naveen Nair, Wei Pan, and Spandan Roy

TL;DR
AeroPlace-Flow is a training-free framework that enables aerial manipulators to perform precise, language-guided object placement by synthesizing goal images, grounding them in 3D space, and planning collision-aware object flow, validated through simulation and real-world tests.
Contribution
The paper introduces AeroPlace-Flow, a language-grounded object placement method for aerial manipulators that combines visual foresight, explicit 3D reasoning, and object flow without requiring training.
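As a rough illustration of what "object flow" could mean in practice, the sketch below treats it as the per-point 3D displacement that carries an object's points from their current pose to the goal pose under a single rigid transform. This is an assumption made for illustration, not the paper's stated formulation; the function names estimate_rigid_transform and object_flow are hypothetical, and the Kabsch algorithm is swapped in here as a standard way to recover a rigid transform from corresponding points.

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Kabsch algorithm: find R, t minimizing sum ||R @ src_i + t - dst_i||^2.

    src, dst: (N, 3) arrays of corresponding 3D points (assumed correspondences).
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def object_flow(points, R, t):
    """Per-point displacement that moves `points` to their transformed positions."""
    return (points @ R.T + t) - points
```

Given object points observed now and their imagined goal positions, the flow field is simply object_flow(points, *estimate_rigid_transform(points, goal_points)); a planner could then check the swept points against scene geometry for collisions.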
Findings
Achieves a 75% success rate in real-world experiments.
Performs reliable language-conditioned placement in diverse scenarios.
Operates without predefined poses or task-specific training.
Abstract
Precise object placement remains underexplored in aerial manipulation, where most systems rely on predefined target coordinates and focus primarily on grasping and control. Specifying exact placement poses, however, is cumbersome in real-world settings, where users naturally communicate goals through language. In this work, we present AeroPlace-Flow, a training-free framework for language-grounded aerial object placement that unifies visual foresight with explicit 3D geometric reasoning and object flow. Given RGB-D observations of the object and the placement scene, along with a natural language instruction, AeroPlace-Flow first synthesizes a task-complete goal image using image editing models. The imagined configuration is then grounded into metric 3D space through depth alignment and object-centric reasoning, enabling the inference of a collision-aware object flow that transports the…
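The abstract mentions grounding the imagined goal image into metric 3D space through depth alignment. The excerpt does not say how that alignment is done, so the following is only a minimal sketch under stated assumptions: a relative (up-to-scale) depth map estimated for the goal image is fitted to the sensor's metric depth with a least-squares scale and shift, and a pixel is then back-projected with a pinhole camera model. The names align_depth and backproject, and the intrinsics fx, fy, cx, cy, are hypothetical.

```python
import numpy as np

def align_depth(relative_depth, metric_depth, valid_mask):
    """Fit scale s and shift t so that s * relative_depth + t ~= metric_depth.

    Only pixels where valid_mask is True (e.g. unedited background with
    valid sensor depth) contribute to the fit. This is an assumed scheme.
    """
    x = relative_depth[valid_mask].ravel()
    y = metric_depth[valid_mask].ravel()
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t

def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at metric depth z into the camera frame."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
```

With s and t in hand, s * relative_depth + t gives a metric depth map for the goal image, and backproject turns any goal pixel into a 3D point that can be compared against the current scene.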
Taxonomy
Topics: Robot Manipulation and Learning · Robotic Path Planning Algorithms · Multimodal Machine Learning Applications
