Learning Spatial-Aware Manipulation Ordering

Yuxiang Yan; Zhiyuan Zhou; Xin Gao; Guanghao Li; Shenglin Li; Jiaqi Chen; Qunyan Pu; Jian Pu

arXiv:2510.25138 · cs.RO · January 1, 2026

TL;DR

OrderMind is a framework that learns spatial-aware manipulation orderings for cluttered environments. It combines spatial context encoding with supervision from a vision-language model to make robotic manipulation more efficient and robust.

Contribution

We introduce OrderMind, a unified spatial-aware manipulation ordering framework that leverages spatial context encoding and a spatial prior labeling method for improved manipulation planning.

Findings

1. Outperforms prior methods in effectiveness and efficiency
2. Successfully applied in both simulation and real-world environments
3. Handles complex cluttered scenes with high accuracy

Abstract

Manipulation in cluttered environments is challenging due to spatial dependencies among objects, where an improper manipulation order can cause collisions or blocked access. Existing approaches often overlook these spatial relationships, limiting their flexibility and scalability. To address these limitations, we propose OrderMind, a unified spatial-aware manipulation ordering framework that directly learns object manipulation priorities based on spatial context. Our architecture integrates a spatial context encoder with a temporal priority structuring module. We construct a spatial graph using k-Nearest Neighbors to aggregate geometric information from the local layout, and we encode both object-object and object-manipulator interactions to support accurate manipulation ordering in real time. To generate physically and semantically plausible supervision signals, we introduce a spatial…
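
The k-Nearest Neighbors graph mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `knn_graph` and `aggregate_relative_offsets`, the use of 3D object centers as node positions, and the plain mean over neighbor offsets (standing in for a learned encoder) are all assumptions made for illustration.

```python
import numpy as np


def knn_graph(centers: np.ndarray, k: int) -> np.ndarray:
    """Return an (N, k) array of neighbor indices, nearest first.

    `centers` is an (N, 3) array of object center positions
    (an assumed node representation, not taken from the paper).
    """
    n = centers.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of objects")
    # Pairwise Euclidean distances via broadcasting, shape (N, N).
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Exclude self-edges so an object is never its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Sort each row by distance and keep the k closest objects.
    return np.argsort(dist, axis=1)[:, :k]


def aggregate_relative_offsets(centers: np.ndarray, neighbors: np.ndarray) -> np.ndarray:
    """Mean offset from each object to its neighbors, shape (N, 3).

    A hypothetical stand-in for learned aggregation of local geometry.
    """
    offsets = centers[neighbors] - centers[:, None, :]
    return offsets.mean(axis=1)


# Toy scene: four objects placed along the x-axis.
centers = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [3.0, 0.0, 0.0],
                    [7.0, 0.0, 0.0]])
neighbors = knn_graph(centers, k=2)
local_features = aggregate_relative_offsets(centers, neighbors)
```

In this toy scene, each row of `neighbors` lists the two closest objects to the corresponding object, and `local_features` summarizes where those neighbors lie relative to it. A learned model would replace the mean with a trainable aggregation and would also need the object-manipulator interactions the abstract describes, which this sketch omits.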
