Reimagining Parameter Space Exploration with Diffusion Models
Lijun Zhang, Xiao Liu, Hui Guan

TL;DR
This paper proposes using diffusion models to generate task-specific neural network parameters directly from task identifiers, offering a potential alternative to traditional fine-tuning, with promising results on seen tasks but limitations on unseen tasks.
Contribution
Introduces a diffusion-model-based method that generates task-specific neural network parameters directly from task identity, bypassing task-specific training.
Findings
Diffusion models can generate accurate parameters for seen tasks.
They support multi-task interpolation when parameter subspaces are well-structured.
They struggle to generalize to unseen tasks.
Abstract
Adapting neural networks to new tasks typically requires task-specific fine-tuning, which is time-consuming and reliant on labeled data. We explore a generative alternative that produces task-specific parameters directly from task identity, eliminating the need for task-specific training. To this end, we propose using diffusion models to learn the underlying structure of effective task-specific parameter space and synthesize parameters on demand. Once trained, the task-conditioned diffusion model can generate specialized weights directly from task identifiers. We evaluate this approach across three scenarios: generating parameters for a single seen task, for multiple seen tasks, and for entirely unseen tasks. Experiments show that diffusion models can generate accurate task-specific parameters and support multi-task interpolation when parameter subspaces are well-structured, but fail to…
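To make "generate specialized weights directly from task identifiers" concrete, the following is a minimal sketch of task-conditioned DDPM-style ancestral sampling over a flattened parameter vector. It is not the authors' implementation. Everything named here is an assumption for illustration: the linear noise schedule, the function names, the `task_means` table, and especially `toy_noise_predictor`, which stands in for a trained conditional network by computing the exact noise for a known target vector.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def toy_noise_predictor(x_t, task_id, alpha_bar_t, task_means):
    """Hypothetical stand-in for a trained network eps_theta(x_t, t, task).

    It assumes the clean parameters for a task are exactly
    task_means[task_id], so the noise in x_t can be solved for in closed
    form from x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    """
    x0 = task_means[task_id]
    scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [(x - scale * m) / noise_scale for x, m in zip(x_t, x0)]


def sample_parameters(task_id, task_means, num_steps=50, seed=0):
    """Run the reverse diffusion chain conditioned on a task identifier."""
    rng = random.Random(seed)
    betas = linear_beta_schedule(num_steps)
    alphas = [1.0 - b for b in betas]

    # Cumulative products of alpha_t, usually written alpha_bar_t.
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    # Start from pure Gaussian noise with the same size as the parameters.
    dim = len(task_means[task_id])
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    for t in reversed(range(num_steps)):
        eps = toy_noise_predictor(x, task_id, alpha_bars[t], task_means)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            # Add fresh noise on every step except the last one.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


# Hypothetical "task identifier -> good parameters" table for the toy predictor.
task_means = {"task_a": [0.5, -1.0, 2.0], "task_b": [1.5, 0.0, -0.5]}
generated = sample_parameters("task_a", task_means, num_steps=20, seed=1)
print(generated)
```

Because the toy predictor already knows the target vector, the final denoising step recovers `task_means["task_a"]` up to floating-point error. In a real system that oracle would be replaced by a learned network trained on collections of task-specific weights, and the quality of the generated parameters would depend entirely on how well that network models the parameter space.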
