FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models

Yucheng Xie; Fu Feng; Ruixiao Shi; Jianlu Shen; Jing Wang; Yong Rui; Xin Geng

arXiv:2409.19289 · cs.CV · March 5, 2026

TL;DR

FINE is a pre-training approach that factorizes a model's knowledge into fundamental components called learngenes, which can directly initialize diffusion models of different sizes without repeating pre-training for each size.

Contribution

FINE's method of representing model weights as a product of shared learngenes and layer-specific factors allows efficient, transferable initialization for models of different sizes, reducing training costs.

Findings

01. Achieves state-of-the-art performance in variable-sized model initialization.
02. Enables effective adaptation of models to diverse tasks.
03. Reduces retraining effort to light fine-tuning of layer-specific components.

Abstract

The training of diffusion models is computationally intensive, making effective pre-training essential. However, real-world deployments often demand models of variable sizes due to diverse memory and computational constraints, posing challenges when corresponding pre-trained versions are unavailable. To address this, we propose FINE, a novel pre-training method whose resulting model can flexibly factorize its knowledge into fundamental components, termed learngenes, enabling direct initialization of models of various sizes and eliminating the need for repeated pre-training. Rather than optimizing a conventional full-parameter model, FINE represents each layer's weights as the product of $U_{\star}$, $\Sigma_{\star}^{(l)}$, and $V_{\star}^{\top}$, where $U_{\star}$ and $V_{\star}$ serve as size-agnostic learngenes shared across layers, while $\Sigma_{\star}^{(l)}$ remains layer-specific.…
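The factorization described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the sizes, the random stand-in values, the helper name `build_layer_weights`, and the choice to treat each layer-specific factor as a diagonal matrix stored as a vector are all assumptions made for illustration. The abstract states only that each layer's weights are a product of two shared factors and one layer-specific factor.

```python
import numpy as np


def build_layer_weights(U, V, sigmas):
    """Compose one weight matrix per layer as U @ diag(sigma) @ V.T.

    U, V   : factors shared by every layer (the "learngenes").
    sigmas : one vector per layer (assumed diagonal layer-specific factor).
    """
    # Scaling the columns of U by sigma is equivalent to U @ diag(sigma).
    return [(U * sigma) @ V.T for sigma in sigmas]


rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 6, 4  # hypothetical sizes

# Random stand-ins for shared factors that would come from pre-training.
U = rng.standard_normal((d_out, rank))
V = rng.standard_normal((d_in, rank))

# Two target models of different depth reuse the same U and V;
# only the per-layer vectors differ (random placeholders here).
sigmas_small = [rng.standard_normal(rank) for _ in range(2)]
sigmas_large = [rng.standard_normal(rank) for _ in range(5)]

small_model = build_layer_weights(U, V, sigmas_small)
large_model = build_layer_weights(U, V, sigmas_large)

# Parameter accounting under these assumptions.
shared_params = U.size + V.size  # stored once, independent of depth
per_layer_params = rank          # grows linearly with depth
```

Under these assumptions, going from 2 to 5 layers adds only three length-`rank` vectors, while the shared factors are stored once. How the layer-specific factors are actually parameterized and initialized for a new model size is not specified in the truncated abstract, so the random placeholders above should not be read as the paper's procedure.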

Taxonomy

Topics: Simulation Techniques and Applications · Model-Driven Software Engineering Techniques