Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers

Yiqing Shi; Yiren Song; Mike Zheng Shou

arXiv:2511.18673 · cs.CV · November 25, 2025

Open Access

TL;DR

Edit2Perceive leverages image editing diffusion models to achieve state-of-the-art dense perception across depth, normal, and matting tasks, emphasizing structure preservation and efficiency.

Contribution

The paper introduces a unified diffusion framework, Edit2Perceive, that adapts editing models for dense perception tasks with structure-preserving refinement and faster inference.

Findings

01. State-of-the-art results across depth, normal, and matting tasks

02. Effective structure-preserving refinement during denoising

03. Faster inference with a single-step deterministic approach

Abstract

Recent advances in diffusion transformers have shown remarkable generalization in visual synthesis, yet most dense perception methods still rely on text-to-image (T2I) generators designed for stochastic generation. We revisit this paradigm and show that image editing diffusion models are inherently image-to-image consistent, providing a more suitable foundation for dense perception tasks. We introduce Edit2Perceive, a unified diffusion framework that adapts editing models for depth, normal, and matting. Built upon the FLUX.1 Kontext architecture, our approach employs full-parameter fine-tuning and a pixel-space consistency loss to enforce structure-preserving refinement across intermediate denoising states. Moreover, our single-step deterministic inference yields faster runtime while training on relatively small datasets. Extensive experiments demonstrate comprehensive…
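The runtime benefit of single-step inference can be illustrated with a toy example. The sketch below is not the paper's implementation: `euler_sample` and `make_toy_velocity` are hypothetical names, and the toy velocity field is an assumed stand-in for a trained network. It only shows that when the path from the starting state to the target is straight, one Euler step reaches the same result as many steps while calling the model once.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]
VelocityFn = Callable[[Vector, float], Vector]


def euler_sample(x0: Vector, velocity: VelocityFn, num_steps: int) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps.

    No random noise is drawn anywhere, so the output is deterministic
    for a given starting state x0.
    """
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity(x, t)  # one (expensive) model call per step
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target: Vector) -> Tuple[VelocityFn, Dict[str, int]]:
    """Hypothetical stand-in for a trained network (an assumption).

    Returns a velocity field that always points straight at `target`,
    plus a counter that records how many times it was evaluated.
    """
    calls = {"n": 0}

    def velocity(x: Vector, t: float) -> Vector:
        calls["n"] += 1
        # Straight-line field: remaining distance divided by remaining time.
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]

    return velocity, calls


if __name__ == "__main__":
    target = [0.2, 0.8, 0.5]   # toy "prediction" values
    start = [0.0, 0.0, 0.0]    # toy starting state

    for steps in (1, 20):
        velocity, calls = make_toy_velocity(target)
        result = euler_sample(start, velocity, num_steps=steps)
        print(steps, calls["n"], [round(r, 6) for r in result])
```

In this toy setting the cost scales with the number of model calls, so a single step is as many times cheaper as the multi-step schedule is long. Whether a real model tolerates a single step depends on how it was trained, which this sketch does not address.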

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Digital Humanities and Scholarship · Cell Image Analysis Techniques