LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation

Jingjing Wang; Zhengdong Hong; Chong Bao; Yuke Zhu; Junhan Sun; Guofeng Zhang

arXiv:2604.08475 · cs.CV · April 22, 2026

TL;DR

LAMP lifts image-editing as 3D priors to extract inter-object 3D transformations, enabling precise and generalizable open-world robotic manipulation.

Contribution

Introduces a method that lifts 2D image-editing cues into 3D representations for manipulation tasks.

Findings

01. Achieves accurate 3D transformations in manipulation tasks.

02. Demonstrates strong zero-shot generalization in open-world scenarios.

03. Outperforms existing methods in fine-grained spatial reasoning.

Abstract

Human-like generalization in open-world settings remains a fundamental challenge for robotic manipulation. Existing learning-based methods, including reinforcement learning, imitation learning, and vision-language-action models (VLAs), often struggle with novel tasks and unseen environments. Another promising direction is to explore generalizable representations that capture fine-grained spatial and geometric relations for open-world manipulation. While large language models (LLMs) and vision-language models (VLMs) provide strong semantic reasoning based on language or annotated 2D representations, their limited 3D awareness restricts their applicability to fine-grained manipulation. To address this, we propose LAMP, which lifts image-editing as 3D priors to extract inter-object 3D transformations as continuous, geometry-aware representations. Our key insight is that image-editing inherently…
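To make the phrase "inter-object 3D transformations" concrete, here is a minimal sketch of how the pose of one object can be expressed relative to another as a rigid transform. This is not LAMP's pipeline, which the abstract does not describe at this level; the function names (`rot_z`, `make_pose`, `invert_pose`, `relative_pose`) and the rack and mug poses are hypothetical, chosen only for illustration.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def make_pose(R, t):
    """Pack a 3x3 rotation R and a 3-vector t into a 4x4 homogeneous matrix."""
    return [R[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def matmul(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_pose(T):
    """Invert a rigid transform: rotation becomes R^T, translation becomes -R^T t."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return make_pose(Rt, t_inv)

def relative_pose(T_cam_a, T_cam_b):
    """Pose of object B expressed in object A's frame: inv(T_cam_a) @ T_cam_b."""
    return matmul(invert_pose(T_cam_a), T_cam_b)

# Hypothetical camera-frame poses of a rack and a mug (illustrative values only).
T_cam_rack = make_pose(rot_z(math.pi / 2), [1.0, 0.0, 0.0])
T_cam_mug = make_pose(rot_z(math.pi / 2), [1.0, 0.5, 0.0])

# Continuous, geometry-aware description of where the mug sits relative to the rack.
T_rack_mug = relative_pose(T_cam_rack, T_cam_mug)
```

With these values, `T_rack_mug` has an identity rotation (up to floating-point error) and a translation of roughly 0.5 along the rack's x-axis. A representation of this kind does not depend on where the camera is, which is one reason relative transforms are a common choice for describing manipulation goals.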

Code & Models

Repositories

https://zju3dv.github.io/LAMP
