Free Lunch for Stabilizing Rectified Flow Inversion
Chenru Wang, Beier Zhu, Chi Zhang

TL;DR
This paper introduces Proximal-Mean Inversion and mimic-CFG, novel gradient correction methods for rectified flow models that improve stability, reconstruction, and editing quality without additional training.
Contribution
The paper proposes training-free gradient correction techniques that stabilize rectified flow inversion, enhancing reconstruction and editing performance with theoretical guarantees.
Findings
Significantly improved inversion stability and reconstruction quality.
Enhanced editing fidelity with reduced neural evaluations.
Achieved state-of-the-art results on PIE-Bench.
Abstract
Rectified-Flow (RF)-based generative models have recently emerged as strong alternatives to traditional diffusion models, demonstrating state-of-the-art performance across various tasks. By learning a continuous velocity field that transforms simple noise into complex data, RF-based models not only enable high-quality generation, but also support training-free inversion, which facilitates downstream tasks such as reconstruction and editing. However, existing inversion methods, such as vanilla RF-based inversion, suffer from approximation errors that accumulate across timesteps, leading to unstable velocity fields and degraded reconstruction and editing quality. To address this challenge, we propose Proximal-Mean Inversion (PMI), a training-free gradient correction method that stabilizes the velocity field by guiding it toward a running average of past velocities, constrained within a…
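The running-average idea described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract is cut off before it states the constraint, so the sketch leaves that constraint out entirely. The names `toy_velocity`, `invert`, and `lam`, the quadratic proximal objective, and the convention that inversion runs from t = 0 (data) to t = 1 (noise) are all assumptions made for illustration.

```python
import numpy as np


def toy_velocity(x, t):
    # Hypothetical stand-in for a learned RF velocity field (assumption).
    return -x + np.sin(3.0 * t)


def invert(x0, velocity, num_steps=10, lam=0.0):
    """Euler inversion with an optional pull toward the running mean velocity.

    lam = 0 gives plain Euler inversion. lam > 0 replaces each velocity v by
    the closed-form minimiser of
        0.5 * ||u - v||^2 + 0.5 * lam * ||u - mean_v||^2,
    which is (v + lam * mean_v) / (1 + lam). This objective is an assumed
    stand-in, not the formulation from the paper.
    """
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / num_steps
    mean_v = None  # running average of the velocities used so far
    used = []
    for k in range(num_steps):
        t = k * dt
        v = velocity(x, t)
        if mean_v is not None and lam > 0.0:
            v = (v + lam * mean_v) / (1.0 + lam)
        # Incremental mean over the k + 1 velocities used so far.
        mean_v = v if mean_v is None else mean_v + (v - mean_v) / (k + 1)
        x = x + dt * v
        used.append(v)
    return x, np.stack(used)
```

Calling `invert(np.array([1.0, -0.5]), toy_velocity, lam=0.5)` returns the final state and the per-step velocities. In this toy setting, a larger `lam` keeps each velocity closer to the running mean, which loosely mirrors the overcorrection trade-off the reviewers raise for large $\lambda$.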
Peer Reviews
Decision: ICLR 2026 Poster
* The paper introduces a theoretically grounded proximal correction framework supported by rigorous stability and error analyses.
* The PMI formulation is elegant and practical, and it addresses the instability and accumulated inversion errors in flow-based generative models.
* Extensive quantitative and qualitative evaluations on PIE-Bench demonstrate consistent improvements across multiple baselines, in both inversion and editing tasks.
* The paper is well written, clearly organized, and easy to fol
* Although the focus is on inversion-based editing, it would strengthen the paper to compare against a broader set of diffusion and flow-based editing baselines on PIE-Bench.
* Including the full benchmark results or additional comparisons with diffusion-based methods (e.g., DDIM inversion variants) would provide clearer context.
* Missing relevant recent works such as InfEdit [1], which explores inversion-free diffusion-model-based editing, and FlowEdit [3] / FlowChef [2], which also employ flo
1. The proposed methods are technically sound and supported by theoretical proofs.
2. The approach achieves state-of-the-art quality on PIE-Bench with fewer sampling steps and no additional NFEs, accelerating the model.
3. Mimic-CFG provides an efficient guidance mechanism for editing on top of CFG, balancing structural consistency and editing control according to the experimental results.
1. The performance of the proposed methods seems to depend on hyperparameter selection, with a risk of overcorrection. Overcorrection (small $w$) harms both background preservation and editing quality, despite improved SSIM/PSNR. Similarly, for the proximal operator parameter $\lambda$ in editing, large values can lead to overcorrection, compromising editing quality.
2. How sensitive is the performance to the hyperparameters across different RF models?
3. While th
- The motivation of this paper is clear. Inaccurate inversion is a noticeable problem of RF-based methods, and many prior works have attempted to eliminate it. The proposed method aims to address this problem through gradient correction, which is intuitive and promising.
- The derivation of PMI is mathematically detailed, including closed-form updates (see Proposition 1 and Appendix A.1), and an analysis of error bounds is also provided (Proposition 2, Appendix A.3).
- Experiments
- Some ablation studies are missing. As stated in Line 238, the authors use a first-order Taylor expansion to estimate the objective in Equation (10). The authors should run experiments showing how the expansion order influences performance.
- The settings of the baseline methods are not clearly stated in the paper. Some hyperparameters in the baseline methods are crucial to their performance (such as the feature-sharing choices in FireFlow and RF-Solver). Empirically, the flaws of
Taxonomy
Topics: Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis · Seismic Imaging and Inversion Techniques
