TL;DR
SplitFlow introduces a flow decomposition-and-aggregation method for inversion-free text-to-image editing, improving semantic fidelity and attribute disentanglement over existing zero-shot approaches.
Contribution
The paper proposes a flow decomposition-and-aggregation framework with a soft-aggregation mechanism to enhance image editing quality without inversion.
Findings
Outperforms existing zero-shot editing methods in semantic fidelity.
Enhances diversity and consistency in edited images.
Effectively suppresses semantic redundancy during editing.
Abstract
Rectified flow models have become a de facto standard in image generation due to their stable sampling trajectories and high-fidelity outputs. Despite their strong generative capabilities, they face critical limitations in image editing tasks: inaccurate inversion processes for mapping real images back into the latent space, and gradient entanglement issues during editing often result in outputs that do not faithfully reflect the target prompt. Recent efforts have attempted to directly map source and target distributions via ODE-based approaches without inversion; however, these methods still yield suboptimal editing quality. In this work, we propose a flow decomposition-and-aggregation framework built upon an inversion-free formulation to address these limitations. Specifically, we semantically decompose the target prompt into multiple sub-prompts, compute an independent flow for each,…
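To make the decompose-then-aggregate idea more concrete, here is a minimal sketch of softly combining several per-sub-prompt flows into a single editing flow. The abstract is cut off before the aggregation rule is stated, so everything specific here is an assumption for illustration rather than the paper's formulation: the function name `soft_aggregate`, the `temperature` parameter, and the choice to weight each sub-flow by a softmax over its cosine similarity to a reference flow (for instance, the flow computed from the full target prompt). Flows are represented as plain NumPy vectors instead of model outputs.

```python
import numpy as np


def soft_aggregate(sub_flows, reference_flow, temperature=1.0):
    """Blend K per-sub-prompt flows into one flow using softmax weights.

    ASSUMPTION (not taken from the paper): each weight is a softmax over the
    cosine similarity between a sub-flow and a reference flow.
    """
    flows = np.asarray(sub_flows, dtype=float)      # shape (K, D)
    ref = np.asarray(reference_flow, dtype=float)   # shape (D,)
    eps = 1e-12                                     # guards against zero norms

    # Cosine similarity of every sub-flow with the reference flow.
    sims = flows @ ref / (np.linalg.norm(flows, axis=1) * np.linalg.norm(ref) + eps)

    # Numerically stable softmax over the similarities.
    logits = sims / temperature
    logits = logits - logits.max()
    weights = np.exp(logits)
    weights = weights / weights.sum()

    # Weighted sum of the sub-flows, shape (D,).
    return weights @ flows, weights


# Toy example: two hypothetical sub-prompt flows in a 2-D latent space.
sub_flows = [[1.0, 0.0], [0.0, 1.0]]
reference = [1.0, 1.0]
flow, weights = soft_aggregate(sub_flows, reference)
```

In this toy case both sub-flows are equally aligned with the reference, so they receive equal weight. Lowering `temperature` would sharpen the weighting toward the best-aligned sub-flow, while raising it moves the result toward a plain average.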