FACM: Flow-Anchored Consistency Models
Yansong Peng, Kai Zhu, Yu Liu, Pingyu Wu, Hebei Li, Xiaoyan Sun, Feng Wu

TL;DR
FACM introduces a flow-anchored approach to consistency models, improving training stability and generation efficiency by explicitly linking the model to the underlying flow, achieving state-of-the-art results on ImageNet.
Contribution
The paper proposes a novel flow-anchoring technique with an expanded time interval strategy, enabling stable training and scalable high-quality image generation.
Findings
Achieves state-of-the-art FID of 1.32 with 2 steps on ImageNet.
Scales to 14B parameters with improved inference speed.
Develops a memory-efficient Chain-JVP for large-scale training.
Abstract
Continuous-time Consistency Models (CMs) promise efficient few-step generation but face significant challenges with training instability. We argue this instability stems from a fundamental conflict: Training the network exclusively on a shortcut objective leads to the catastrophic forgetting of the instantaneous velocity field that defines the flow. Our solution is to explicitly anchor the model in the underlying flow, ensuring high trajectory fidelity during training. We introduce the Flow-Anchored Consistency Model (FACM), where a Flow Matching (FM) task serves as a dynamic anchor for the primary CM shortcut objective. Key to this Flow-Anchoring approach is a novel expanded time interval strategy that unifies optimization for a single model while decoupling the two tasks to ensure stable, architecturally-agnostic training. By distilling a pre-trained LightningDiT model, our method…
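The dual-objective idea in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation. It assumes a linear path x_t = (1 - t) x0 + t x1 with velocity v = x1 - x0, a shortcut parameterization f(x_t, t) = x_t + (1 - t) F, and an assumed condition mapping in which the FM task sees c = t in [0, 1] and the CM task sees c = 2 - t in [1, 2]. The scalar `model`, the `Dual` class, and all function names are hypothetical. The forward-mode derivative is hand-rolled with dual numbers as a stand-in for a framework JVP.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value paired with a tangent, for forward-mode differentiation."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule on the tangent component.
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Wrap a plain number as a Dual with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x))


def model(params, x, c):
    """Hypothetical toy network F(x, c); accepts floats or Duals."""
    w_x, w_c, w_xc, b = params
    return w_x * x + w_c * c + w_xc * (x * c) + b


def model_with_time_derivative(params, x_t, v, c, dc_dt):
    """Return F and its total derivative dF/dt along the trajectory.

    The tangents (v, dc_dt) are dx_t/dt and dc/dt, so the tangent of the
    output is the JVP of the model in that direction.
    """
    out = model(params, Dual(x_t, v), Dual(c, dc_dt))
    return out.val, out.tan


def facm_toy_losses(params, x0, x1, t):
    """Squared-error FM anchor loss and CM shortcut loss for one sample."""
    x_t = (1.0 - t) * x0 + t * x1
    v = x1 - x0  # velocity of the assumed linear path

    # FM anchor: with condition c = t, regress the instantaneous velocity.
    fm_pred = model(params, x_t, t)
    fm_loss = (fm_pred - v) ** 2

    # CM shortcut: with condition c = 2 - t (so dc/dt = -1), require
    # d/dt [x_t + (1 - t) F] = 0, which rearranges to F = v + (1 - t) dF/dt.
    F, dF_dt = model_with_time_derivative(params, x_t, v, 2.0 - t, -1.0)
    target = v + (1.0 - t) * dF_dt  # a real trainer would stop gradients here
    cm_loss = (F - target) ** 2
    return fm_loss, cm_loss
```

For a single straight-line pair, a model that outputs the constant v drives both losses to zero, which is a quick sanity check on the sketch. In a distillation setting such as the one the abstract describes, v would presumably come from the pre-trained teacher rather than from x1 - x0; that substitution is also an assumption here.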
Peer Reviews
Decision: ICLR 2026 Poster
- **[S1] This paper is original in the aspect that it provides a new perspective into training instability of CMs.** Specifically, the authors hypothesize that CM training instability arises from missing flow velocity supervision, and that one should decouple flow and mean velocity time conditions to mitigate conflict between the two velocities.
- **[S2] This paper is significant in the aspect that it provides a number of techniques for scaling CMs.** The authors provide a number of practical t
- **[W1] The paper lacks theoretical novelty, in the sense that it is a special case of MeanFlow.** MeanFlow learns a flow map between all time pairs $(t,r)$ for $0 \leq t < r \leq 1$, along with a flow matching loss at $t = r$. FACM is a special instance of MeanFlow where a flow map is learned only for time pairs $(t,1)$ for $0 \leq t \leq 1$, also with flow matching loss at $t = r$, where $r = 2 - c_{FM}$ if one uses expanded time interval proposed in Section 3.3.2. Under this perspective, FA
1. The paper identifies a clear and well-motivated problem (instability in continuous-time consistency model training) and provides a principled remedy via flow anchoring.
2. The formulation is elegant, combining the benefits of flow matching (stability, theoretical grounding) and consistency models (few-step generation) into a unified loss.
3. The expanded time interval trick is simple yet effective, ensuring a smooth transition between FM and CM regions while avoiding gradient coupling issue
1. The theoretical discussion remains somewhat heuristic. While the “anchoring” intuition is appealing, a more rigorous analysis (e.g., convergence guarantees or stability proofs) would strengthen the contribution.
2. The relationship between FACM and existing flow–consistency hybrids (e.g., MeanFlow, iCM) could be clarified, in particular what distinguishes the proposed anchoring mechanism from joint FM/CM training used before.
3. Some ablations (e.g., varying the relative weight of FM vs. CM
1. The analysis of the training instability issue in continuous-time consistency models is clear and convincing.
2. The proposed flow-anchoring solution is elegant and effective, and the process of dual-objective training is clearly presented.
3. The experimental results are impressive: in Table 1, FACM at NFE=2 outperforms the multi-NFE LightningDiT baseline.
1. Although the paper shows efficiency in terms of reduced NFE, it would be good to also report the total generation time per image.
2. The individual components of the proposed algorithm, including the flow matching loss and the consistency model loss, are existing work, and the proposed FSDP-compatible Chain-JVP scheme still has significant computational overhead.
Taxonomy
Topics: Simulation Techniques and Applications
Methods: Consistency Models
