The Quadratic Geometry of Flow Matching: Semantic Granularity Alignment for Text-to-Image Synthesis
Zhinan Xiong, Shunqi Yuan

TL;DR
This paper takes a geometric view of flow matching in generative models, showing that the standard MSE objective is a quadratic form governed by an evolving neural tangent kernel (NTK), and proposes Semantic Granularity Alignment (SGA) to improve text-to-image synthesis.
Contribution
It provides a novel geometric analysis of flow matching optimization dynamics and introduces SGA, a method to enhance training efficiency and output quality in text-to-image models.
Findings
SGA accelerates convergence in text-to-image synthesis.
SGA improves structural integrity of generated images.
Geometric analysis reveals the neural tangent kernel's role in training dynamics.
Abstract
In this work, we analyze the optimization dynamics of generative fine-tuning. We observe that under the Flow Matching framework, the standard MSE objective can be formulated as a Quadratic Form governed by a dynamically evolving Neural Tangent Kernel (NTK). This geometric perspective reveals a latent Data Interaction Matrix, where diagonal terms represent independent sample learning and off-diagonal terms encode residual correlation between heterogeneous features. Although standard training implicitly optimizes these cross-term interferences, it does so without explicit control; moreover, the prevailing data-homogeneity assumption may constrain the model's effective capacity. Motivated by this insight, we propose Semantic Granularity Alignment (SGA), using Text-to-Image synthesis as a testbed. SGA engineers targeted interventions in the vector residual field to mitigate gradient…
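The quadratic-form view can be illustrated numerically. The sketch below is not the paper's implementation. It assumes a linear interpolation path x_t = (1 - t) x0 + t x1 with target velocity x1 - x0, a toy one-hidden-layer velocity network, and one plausible reading of the interaction matrix, M[i, j] = r_i^T K_ij r_j with K_ij = J_i J_j^T the empirical NTK block. All names (`velocity`, `jacobian`, `M`) and sizes are hypothetical. Under these assumptions the squared gradient norm of the MSE loss equals the sum of all entries of M, so the diagonal collects per-sample terms and the off-diagonal collects cross-sample terms.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n = 2, 8, 4                     # data dim, hidden width, batch size (illustrative)
P = h * (d + 1) + d * h               # total number of parameters
theta = 0.5 * rng.standard_normal(P)  # flat parameter vector

def velocity(theta, x_t, t):
    """Toy velocity field v_theta(x_t, t) with one hidden tanh layer (assumed model)."""
    W1 = theta[: h * (d + 1)].reshape(h, d + 1)
    W2 = theta[h * (d + 1):].reshape(d, h)
    return W2 @ np.tanh(W1 @ np.append(x_t, t))

def jacobian(theta, x_t, t, eps=1e-6):
    """Central-difference Jacobian of the output w.r.t. theta, shape (d, P)."""
    J = np.zeros((d, theta.size))
    for p in range(theta.size):
        step = np.zeros(theta.size)
        step[p] = eps
        J[:, p] = (velocity(theta + step, x_t, t)
                   - velocity(theta - step, x_t, t)) / (2 * eps)
    return J

# Flow matching batch on a linear path (assumed): noise x0, stand-in data x1.
x0 = rng.standard_normal((n, d))
x1 = rng.standard_normal((n, d)) + 2.0
t = rng.uniform(size=n)
x_t = (1 - t)[:, None] * x0 + t[:, None] * x1
u = x1 - x0                           # target velocity

def loss(theta):
    """MSE objective: 0.5 * sum_i ||v_theta(x_t^i, t_i) - u_i||^2."""
    return 0.5 * sum(np.sum((velocity(theta, x_t[i], t[i]) - u[i]) ** 2)
                     for i in range(n))

# Residuals r_i and per-sample gradients g_i = J_i^T r_i, stacked to shape (n, P).
residuals = np.stack([velocity(theta, x_t[i], t[i]) - u[i] for i in range(n)])
per_sample_grads = np.stack(
    [jacobian(theta, x_t[i], t[i]).T @ residuals[i] for i in range(n)]
)

# Interaction matrix: M[i, j] = g_i . g_j = r_i^T (J_i J_j^T) r_j.
M = per_sample_grads @ per_sample_grads.T
grad = per_sample_grads.sum(axis=0)   # full-batch gradient of the loss

diag_part = np.trace(M)               # per-sample (independent) contribution
off_diag_part = M.sum() - diag_part   # cross-sample interaction contribution
print("diagonal:", diag_part, "off-diagonal:", off_diag_part)
```

To first order, a gradient step with learning rate eta changes the loss by -eta * M.sum(). A negative off-diagonal entry therefore marks a pair of samples whose gradients partly cancel, which is one way to read the cross-term interference mentioned in the abstract. Because the network is nonlinear, the Jacobians and hence M change as theta is updated.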
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Music Technology and Sound Studies
