FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models
Yucheng Xie, Fu Feng, Ruixiao Shi, Jianlu Shen, Jing Wang, Yong Rui, Xin Geng

TL;DR
FINE introduces a novel pre-training approach that factorizes knowledge into fundamental components called learngenes, enabling flexible initialization of variable-sized diffusion models without extensive retraining.
Contribution
FINE's method of representing model weights as a product of shared learngenes and layer-specific factors allows efficient, transferable initialization for models of different sizes, reducing training costs.
Findings
Achieves state-of-the-art performance in variable-sized model initialization.
Enables effective adaptation of models to diverse tasks.
Reduces retraining effort to light fine-tuning of layer-specific components.
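To make the last finding concrete, the sketch below counts parameters for one stack of equally shaped layers under a factorized parameterization. It is only an illustration: the layer-specific factor is assumed to be a diagonal stored as a vector of length `rank`, and the sizes (`512`, `64`, `12`) and function names are hypothetical, not taken from the paper.

```python
def full_params(d_out, d_in, num_layers):
    """Parameters when every layer stores its own dense d_out x d_in matrix."""
    return d_out * d_in * num_layers


def factorized_params(d_out, d_in, rank, num_layers):
    """Parameters when layers share two factors and keep one small factor each.

    Assumption: the layer-specific factor is a length-`rank` vector (a diagonal).
    """
    shared = (d_out + d_in) * rank   # two shared factor matrices
    per_layer = rank * num_layers    # one small vector per layer
    return shared, per_layer


# Hypothetical sizes, chosen only for illustration.
full = full_params(512, 512, 12)
shared, per_layer = factorized_params(512, 512, 64, 12)
print(f"dense: {full}, shared: {shared}, layer-specific: {per_layer}")
```

Under these assumed sizes, the layer-specific part is a small fraction of the total, which is the sense in which fine-tuning only those components would be light.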
Abstract
The training of diffusion models is computationally intensive, making effective pre-training essential. However, real-world deployments often demand models of variable sizes due to diverse memory and computational constraints, posing challenges when corresponding pre-trained versions are unavailable. To address this, we propose FINE, a novel pre-training method whose resulting model can flexibly factorize its knowledge into fundamental components, termed learngenes, enabling direct initialization of models of various sizes and eliminating the need for repeated pre-training. Rather than optimizing a conventional full-parameter model, FINE represents each layer's weights as the product of three factors, two of which serve as size-agnostic learngenes shared across layers, while the third remains layer-specific.…
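The factorization described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `U`, `V`, `s`, and `init_layers` are hypothetical, the layer-specific factor is assumed to be diagonal, and only depth is varied (the excerpt does not say how width changes are handled).

```python
import numpy as np


def init_layers(U, V, num_layers, rng):
    """Build per-layer weights W_l = U @ diag(s_l) @ V.T from shared factors.

    U, V: shared factor matrices (the size-agnostic part).
    s_l:  layer-specific vector, assumed diagonal; its init here is hypothetical.
    """
    rank = U.shape[1]
    weights = []
    for _ in range(num_layers):
        s = rng.normal(loc=1.0, scale=0.01, size=rank)  # layer-specific factor
        weights.append((U * s) @ V.T)  # U * s scales the columns of U, i.e. U @ diag(s)
    return weights


rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 6, 4           # hypothetical sizes
U = rng.normal(size=(d_out, rank))    # shared across all layers and all models
V = rng.normal(size=(d_in, rank))

small = init_layers(U, V, num_layers=2, rng=rng)  # a shallow model
large = init_layers(U, V, num_layers=5, rng=rng)  # a deeper model, same shared factors
print(len(small), len(large), small[0].shape)
```

Both models reuse the same `U` and `V`; only the per-layer vectors differ, so a model of a new depth needs fresh layer-specific factors but no new shared ones.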
Taxonomy
Topics: Simulation Techniques and Applications · Model-Driven Software Engineering Techniques
