Semantic Granularity Navigation in Image Editing

Liangsi Lu; Minzhe Guo; Xuhang Chen; Yang Shi

arXiv:2605.21190 · cs.CV · May 21, 2026

TL;DR

NaviEdit is a training-free inference-time controller that decouples edit progress from model scale in image editing, aiming to improve semantic responsiveness while preserving structural fidelity, without changing the underlying pretrained model.

Contribution

The paper introduces a training-free, inference-time control method that separates edit progress from model scale traversal, targeting the trade-off between semantic editability and structural fidelity in diffusion and flow models.

Findings

01 Supports decoupling as a portable inference-time control principle

02 Achieves positive gains across various editors and flow backbones

03 Improves semantic responsiveness without retraining models

Abstract

Despite the generative capabilities of diffusion and flow models, real-image editing remains constrained by a persistent trade-off between semantic editability and structural fidelity. We trace a primary cause of this limitation to the implicit coupling of edit progress with model scale in existing paradigms. Under this coupling, stronger edits typically require visiting noisier states, which spends computation on destabilizing layout before the semantic change is well localized. We introduce NaviEdit, a training-free inference-time controller that decouples edit progress from model scale traversal through a strict self-consistency contract. NaviEdit operates at the rollout level and leaves the underlying pretrained model unchanged. It treats scale as a control input and reallocates a fixed step budget toward semantically responsive intermediate scales instead of destructive high-noise…
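The budget-reallocation idea in the abstract can be illustrated with a toy schedule comparison. This is a minimal sketch and not NaviEdit's actual algorithm: it assumes "scale" behaves like a noise level in [0, 1], and the function names, the `cap`, `band`, and `band_fraction` parameters, and the piecewise-linear allocation rule are all hypothetical choices made for illustration. The coupled baseline ties the starting scale to edit strength, while the reallocated schedule spends the same number of steps but concentrates them in an intermediate band.

```python
def coupled_schedule(strength: float, num_steps: int) -> list[float]:
    """Coupled baseline (assumed): a stronger edit starts from a noisier scale
    and descends uniformly to zero."""
    return [strength * (1.0 - i / num_steps) for i in range(num_steps)]


def reallocated_schedule(
    num_steps: int,
    cap: float,
    band: tuple[float, float],
    band_fraction: float = 0.6,
) -> list[float]:
    """Hypothetical decoupled schedule: same step budget, but a chosen fraction
    of the steps is spent inside an intermediate band of scales."""
    lo, hi = band
    assert 0.0 <= lo < hi <= cap <= 1.0

    n_band = round(band_fraction * num_steps)  # steps inside the band
    n_rest = num_steps - n_band
    n_high = n_rest // 2                       # steps between cap and the band
    n_low = n_rest - n_high                    # steps between the band and zero

    def descend(start: float, stop: float, n: int) -> list[float]:
        # n evenly spaced values from start (inclusive) toward stop (exclusive)
        return [start - (start - stop) * i / n for i in range(n)]

    return descend(cap, hi, n_high) + descend(hi, lo, n_band) + descend(lo, 0.0, n_low)


def steps_in_band(schedule: list[float], band: tuple[float, float]) -> int:
    """Count how many steps of a schedule fall inside the band."""
    lo, hi = band
    return sum(1 for s in schedule if lo <= s <= hi)


if __name__ == "__main__":
    band = (0.3, 0.6)
    coupled = coupled_schedule(strength=0.8, num_steps=10)
    realloc = reallocated_schedule(num_steps=10, cap=0.8, band=band)
    print("coupled steps in band:    ", steps_in_band(coupled, band))
    print("reallocated steps in band:", steps_in_band(realloc, band))
```

Both schedules use the same number of steps and the same maximum scale; only the distribution of steps over scales differs. Which band counts as "semantically responsive" is something the paper determines and this sketch simply takes as an input.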
