SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
Yicheng Xiao, Wenhu Zhang, Lin Song, Yukang Chen, Wenbo Li, Nan Jiang, Tianhe Ren, Haokun Lin, Wei Huang, Haoyang Huang, Xiu Li, Nan Duan, Xiaojuan Qi

TL;DR
SpatialEdit introduces a comprehensive benchmark, synthetic dataset, and baseline model for evaluating and advancing fine-grained image spatial editing, emphasizing geometric fidelity and perceptual plausibility.
Contribution
The paper presents the SpatialEdit-Bench benchmark, the SpatialEdit-500k dataset, and the SpatialEdit-16B baseline model, which together provide evaluation and training resources for spatial image editing.
Findings
SpatialEdit-16B outperforms prior methods on spatial manipulation tasks.
The benchmark effectively measures perceptual plausibility and geometric fidelity.
Synthetic data enables scalable training with precise ground-truth transformations.
Abstract
Image spatial editing performs geometry-driven transformations, allowing precise control over object layout and camera viewpoints. Current models are insufficient for fine-grained spatial manipulations, motivating a dedicated assessment suite. Our contributions are as follows: (i) We introduce SpatialEdit-Bench, a complete benchmark that evaluates spatial editing by jointly measuring perceptual plausibility and geometric fidelity via viewpoint reconstruction and framing analysis. (ii) To address the data bottleneck for scalable training, we construct SpatialEdit-500k, a synthetic dataset generated with a controllable Blender pipeline that renders objects across diverse backgrounds and systematic camera trajectories, providing precise ground-truth transformations for both object- and camera-centric operations. (iii) Building on this data, we develop SpatialEdit-16B, a baseline model for…
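As a rough illustration of why systematic camera trajectories give exact ground-truth transformations, the sketch below samples cameras on an orbit around an object and derives the edit label between any two views directly from their parameters. It is not the authors' Blender pipeline: the orbit parameterization and the names `OrbitPose`, `orbit_trajectory`, and `relative_edit` are assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OrbitPose:
    """Hypothetical camera pose on a sphere centered on the object (at the origin)."""
    azimuth_deg: float
    elevation_deg: float
    radius: float

    def position(self):
        # Spherical-to-Cartesian conversion with z as the up axis.
        az = math.radians(self.azimuth_deg)
        el = math.radians(self.elevation_deg)
        return (
            self.radius * math.cos(el) * math.cos(az),
            self.radius * math.cos(el) * math.sin(az),
            self.radius * math.sin(el),
        )


def orbit_trajectory(n_views, elevation_deg, radius):
    """Evenly spaced camera poses on one full orbit at a fixed elevation."""
    step = 360.0 / n_views
    return [OrbitPose(i * step, elevation_deg, radius) for i in range(n_views)]


def relative_edit(src, dst):
    """Ground-truth camera edit that takes the source view to the target view."""
    # Wrap the azimuth difference into [-180, 180) so the label is the shorter turn.
    d_az = (dst.azimuth_deg - src.azimuth_deg + 180.0) % 360.0 - 180.0
    return {
        "d_azimuth_deg": d_az,
        "d_elevation_deg": dst.elevation_deg - src.elevation_deg,
        # A value above 1 means the target camera is closer to the object.
        "zoom": src.radius / dst.radius,
    }


views = orbit_trajectory(n_views=8, elevation_deg=30.0, radius=2.0)
label = relative_edit(views[0], views[7])
```

Because every rendered view comes from known parameters, each (source, target) image pair carries an exact label with no manual annotation, which is the property the abstract attributes to the synthetic data.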
