Semantic Granularity Navigation in Image Editing
Liangsi Lu, Minzhe Guo, Xuhang Chen, Yang Shi

TL;DR
NaviEdit is a training-free inference-time controller that decouples edit progress from model scale in image editing, improving fidelity and semantic responsiveness without changing the underlying pretrained model.
Contribution
NaviEdit introduces an inference-time control method that separates edit progress from model scale traversal, so that a fixed step budget can be spent on semantically responsive intermediate scales in diffusion and flow models.
Findings
Supports decoupling as a portable inference-time control principle
Achieves gains across multiple editors and flow backbones
Improves semantic responsiveness without retraining models
Abstract
Despite the generative capabilities of diffusion and flow models, real-image editing remains constrained by a persistent trade-off between semantic editability and structural fidelity. We trace a primary cause of this limitation to the implicit coupling of edit progress with model scale in existing paradigms. Under this coupling, stronger edits typically require visiting noisier states, which spends computation on destabilizing layout before the semantic change is well localized. We introduce NaviEdit, a training-free inference-time controller that decouples edit progress from model scale traversal through a strict self-consistency contract. NaviEdit operates at the rollout level and leaves the underlying pretrained model unchanged. It treats scale as a control input and reallocates a fixed step budget toward semantically responsive intermediate scales instead of destructive high-noise…
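The budget reallocation described in the abstract can be illustrated with a toy schedule comparison. This is only a sketch under stated assumptions, not NaviEdit's actual controller: it assumes "scale" is a noise level in [0, 1], the function names, the band limits, and the band fraction are all hypothetical, and the self-consistency contract is not modeled.

```python
def coupled_schedule(strength, num_steps):
    """Baseline assumption: edit strength sets the starting scale,
    and the rollout descends linearly from there toward zero."""
    return [strength * (1 - i / num_steps) for i in range(num_steps)]


def decoupled_schedule(num_steps, band=(0.3, 0.6), band_fraction=0.75):
    """Hypothetical reallocation: spend most of a fixed step budget inside an
    intermediate band of scales and never visit scales above the band."""
    lo, hi = band
    n_band = round(num_steps * band_fraction)  # steps spent inside the band
    n_tail = num_steps - n_band                # steps left to finish below it
    # Descend linearly from hi to lo inside the band.
    band_steps = [hi - (hi - lo) * i / max(n_band - 1, 1) for i in range(n_band)]
    # Use the remaining steps to descend from lo toward (but not onto) zero.
    tail_steps = [lo * (1 - (i + 1) / (n_tail + 1)) for i in range(n_tail)]
    return band_steps + tail_steps


if __name__ == "__main__":
    budget = 8
    print("coupled:  ", [round(s, 3) for s in coupled_schedule(0.9, budget)])
    print("decoupled:", [round(s, 3) for s in decoupled_schedule(budget)])
```

Both schedules use the same number of steps. The coupled one starts at a scale of 0.9 for a strong edit, while the decoupled one keeps every step at or below the upper band limit and places three quarters of the budget inside the band.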
