PARSE: Part-Aware Relational Spatial Modeling

Yinuo Bai; Peijun Xu; Kuixiang Shao; Yuyang Jiao; Jingxuan Zhang; Kaixin Yao; Jiayuan Gu; Jingyi Yu

arXiv:2603.07704 · cs.CV · March 10, 2026

TL;DR

PARSE introduces a part-aware framework for modeling object interactions at the part level, enabling more accurate and physically consistent 3D scene reasoning and generation.

Contribution

The paper presents a part-centric assembly graph and a part-aware spatial solver, together with the PARSE-10K dataset, to improve geometry-grounded spatial reasoning in 3D scene modeling.

Findings

01. Fine-tuning Qwen3-VL on PARSE-10K improves its object layout reasoning.

02. Scenes generated with PAGs show improved physical realism and structural complexity.

03. PARSE advances the state of the art in geometry-grounded spatial reasoning.

Abstract

Inter-object relations underpin spatial intelligence, yet existing representations -- linguistic prepositions or object-level scene graphs -- are too coarse to specify which regions actually support, contain, or contact one another, leading to ambiguous and physically inconsistent layouts. To address these ambiguities, a part-level formulation is needed; therefore, we introduce PARSE, a framework that explicitly models how object parts interact to determine feasible and spatially grounded scene configurations. PARSE centers on the Part-centric Assembly Graph (PAG), which encodes geometric relations between specific object parts, and a Part-Aware Spatial Configuration Solver that converts these relations into geometric constraints to assemble collision-free, physically valid scenes. Using PARSE, we build PARSE-10K, a dataset of 10,000 3D indoor scenes constructed from real-image layout…
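To make the part-level idea concrete, here is a minimal sketch of what a part-centric graph and a constraint step could look like. It is not the authors' PAG or solver: the class names (`Part`, `Relation`, `AssemblyGraph`), the `solve_heights` helper, the relation label "supports", and the example dimensions are all hypothetical, and only vertical placement is handled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Part:
    """A named region of an object (hypothetical schema, not from the paper)."""
    obj: str      # object the part belongs to, e.g. "table"
    name: str     # part label, e.g. "tabletop"
    z_min: float  # lowest point of the part in the object's local frame (m)
    z_max: float  # highest point of the part in the object's local frame (m)


@dataclass(frozen=True)
class Relation:
    """A directed relation between two parts, e.g. tabletop supports mug base."""
    kind: str
    source: Part  # the supporting part
    target: Part  # the supported part


@dataclass
class AssemblyGraph:
    edges: list[Relation] = field(default_factory=list)

    def add(self, kind: str, source: Part, target: Part) -> None:
        self.edges.append(Relation(kind, source, target))


def solve_heights(graph: AssemblyGraph, base_heights: dict[str, float]) -> dict[str, float]:
    """Turn 'supports' edges into a world-space z offset for each object.

    The supported part's lowest point is placed on the supporting part's
    highest point. This is an illustrative assumption, not the paper's solver.
    """
    heights = dict(base_heights)
    pending = [e for e in graph.edges if e.kind == "supports"]
    while pending:
        progressed = False
        for edge in list(pending):
            if edge.source.obj in heights:
                heights[edge.target.obj] = (
                    heights[edge.source.obj] + edge.source.z_max - edge.target.z_min
                )
                pending.remove(edge)
                progressed = True
        if not progressed:
            raise ValueError("support chain has no placed root object")
    return heights


# Example: a mug resting on a tabletop, with the table standing on the floor.
tabletop = Part("table", "tabletop", z_min=0.70, z_max=0.75)
mug_base = Part("mug", "base", z_min=0.0, z_max=0.01)

graph = AssemblyGraph()
graph.add("supports", tabletop, mug_base)

heights = solve_heights(graph, base_heights={"table": 0.0})
print(heights)
```

Naming the tabletop rather than the table as the supporter is what removes the ambiguity an object-level edge would leave: the mug's height follows from that part's geometry, not from the table's bounding box.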

Taxonomy

Topics: 3D Shape Modeling and Analysis · Robot Manipulation and Learning · Robotics and Sensor-Based Localization