TL;DR
DirectEdit is a flow-based image editing method that achieves precise inversion and high-fidelity editing without additional neural function evaluations (NFEs), and is reported to outperform existing training-free editing techniques.
Contribution
DirectEdit introduces a direct alignment approach to inversion in flow models that eliminates reconstruction error, improving both the accuracy and the efficiency of image editing.
Findings
Outperforms state-of-the-art image editing methods in accuracy.
Achieves precise inversion without extra neural function evaluations.
Balances fidelity and editability effectively.
Abstract
With recent advancements in large-scale pre-trained text-to-image (T2I) models, training-free image editing methods have demonstrated remarkable success. Typically, these methods involve adding noise to a clean image via an inversion process, followed by separate denoising steps for the reconstruction and editing paths during the forward process. However, since the reconstruction path is approximated using noisy latents from mismatched timesteps, existing methods inevitably suffer from accumulated drift, which fundamentally limits reconstruction fidelity. To address this challenge, we systematically analyze the inversion process within the flow transformer and propose DirectEdit, a simple yet effective editing method that eliminates the inherent reconstruction error without introducing additional neural function evaluations (NFEs). Unlike most prior works that attempt to rectify the…
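The drift described in the abstract can be seen in a toy setting. The sketch below is an illustration only, not the paper's algorithm: `toy_velocity` is a hypothetical scalar stand-in for the flow model's velocity prediction, and all function names are assumptions. It inverts a "clean" value with Euler steps, then reconstructs it in two ways: by re-evaluating the velocity at the noisier latent and timestep (the mismatch the abstract points to), and by replaying the velocities recorded during inversion, which reverses the Euler steps exactly and needs no further model calls. Whether DirectEdit uses this particular mechanism cannot be determined from the truncated abstract.

```python
import math


def toy_velocity(x, t):
    # Hypothetical stand-in for a flow model's velocity prediction v(x, t).
    return math.tanh(x) + t


def invert(x0, num_steps):
    """Euler inversion from t=0 (clean) to t=1 (noisy); records each velocity."""
    dt = 1.0 / num_steps
    x, velocities = x0, []
    for i in range(num_steps):
        v = toy_velocity(x, i * dt)  # evaluated at (x_i, t_i)
        velocities.append(v)
        x = x + dt * v
    return x, velocities


def reconstruct_naive(x1, num_steps):
    """Reverse pass that re-evaluates the model at the noisier latent and timestep."""
    dt = 1.0 / num_steps
    x = x1
    for i in reversed(range(num_steps)):
        v = toy_velocity(x, (i + 1) * dt)  # (x_{i+1}, t_{i+1}), not (x_i, t_i)
        x = x - dt * v
    return x


def reconstruct_cached(x1, velocities):
    """Reverse pass that replays the recorded inversion velocities (no model calls)."""
    dt = 1.0 / len(velocities)
    x = x1
    for v in reversed(velocities):
        x = x - dt * v
    return x


x0 = 0.5
x1, vs = invert(x0, num_steps=10)
print("naive error: ", abs(reconstruct_naive(x1, 10) - x0))
print("cached error:", abs(reconstruct_cached(x1, vs) - x0))
```

In this toy, the naive path accumulates a visible error because each reverse step subtracts a velocity that differs from the one added during inversion, while the cached path recovers the input up to floating-point rounding. Editing is not modeled here; the sketch only covers the reconstruction path.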
