Conditional LoRA Parameter Generation
Xiaolong Jin, Kai Wang, Dongwen Tang, Wangbo Zhao, Yukun Zhou, Junshu Tang, Yang You

TL;DR
This paper introduces COND P-DIFF, a conditional diffusion model that generates high-performance LoRA weights for neural networks, enabling task-specific adaptation during fine-tuning in vision and NLP domains.
Contribution
The paper presents a novel conditional latent diffusion approach for generating high-quality LoRA parameters, advancing controllable and efficient neural network adaptation.
Findings
Successfully generates task-specific high-performance parameters
The distribution of generated parameters differs from that of conventionally optimized parameters, which the authors take as a sign of generalization
Effective in both vision and NLP tasks
Abstract
Generative models have achieved remarkable success in image, video, and text domains. Inspired by this, researchers have explored utilizing generative models to generate neural network parameters. However, these efforts have been limited by the parameter size and the practicality of generating high-performance parameters. In this paper, we propose COND P-DIFF, a novel approach that demonstrates the feasibility of controllable high-performance parameter generation, particularly for LoRA (Low-Rank Adaptation) weights, during the fine-tuning process. Specifically, we employ an autoencoder to extract efficient latent representations for parameters. We then train a conditional latent diffusion model to synthesize high-performing model parameters from random noise based on specific task conditions. Experimental results in both computer vision and natural language processing domains…
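The objects named in the abstract can be sketched in a few lines. The block below is a minimal illustration, not the authors' implementation: it flattens a pair of LoRA factors into the single vector an autoencoder would consume, then applies the standard DDPM forward-noising step. All dimensions, the linear beta schedule, and the helper names (`flatten_lora`, `unflatten_lora`, `lora_delta`, `make_alpha_bar`, `q_sample`) are assumptions chosen for illustration. The autoencoder and the task-conditioned denoiser are omitted, so the flattened vector stands in for the latent.

```python
import numpy as np

def flatten_lora(A, B):
    """Concatenate the LoRA factors into one parameter vector."""
    return np.concatenate([A.ravel(), B.ravel()])

def unflatten_lora(vec, r, d_in, d_out):
    """Inverse of flatten_lora: recover A (r x d_in) and B (d_out x r)."""
    A = vec[: r * d_in].reshape(r, d_in)
    B = vec[r * d_in:].reshape(d_out, r)
    return A, B

def lora_delta(A, B, alpha, r):
    """Low-rank weight update added to the frozen weight: (alpha / r) * B @ A."""
    return (alpha / r) * (B @ A)

def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(z0, t, alpha_bar, rng):
    """Standard DDPM forward step: z_t = sqrt(ab_t) * z0 + sqrt(1 - ab_t) * eps."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 2                    # hypothetical layer sizes and rank
A = rng.standard_normal((r, d_in)) * 0.01    # hypothetical LoRA down-projection
B = rng.standard_normal((d_out, r)) * 0.01   # hypothetical LoRA up-projection

vec = flatten_lora(A, B)                     # stand-in for the autoencoder latent
alpha_bar = make_alpha_bar(1000)
zt, eps = q_sample(vec, 999, alpha_bar, rng) # near-pure noise at the last step
```

In a full pipeline, a denoising network would be trained to predict `eps` from `zt`, the step index, and a task-condition embedding. At sampling time the process would run in reverse from random noise, and the result would be decoded and reshaped with `unflatten_lora`.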
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
1. The presented qualitative results are on par or slightly better than baselines.
1. The paper seems very rushed and incomplete. Figures are missing even though there are references in the text. As a consequence, no visual results are provided, nor other analysis claimed in the paper. 2. The novelty is very limited. The authors build on top of P-Diff by adding only a task conditioning mechanism. 3. There are numerous editorial errors, e.g. lack of spaces (lines 061, 222, 292), wrong formatting (eq. 6b, 7), grammatical errors (270-271), Fig vs Figure, etc.
1. Originality: The idea of using conditional diffusion models to synthesize LoRA parameters for fine-tuning is novel, providing an innovative method for model adaptation. To the best of my knowledge, this is the first paper that bridges parameter-efficient tuning methods and conditional diffusion parameter generation. 2. Quality: The method is well-defined, with experiments that confirm the approach’s effectiveness. Comparisons with traditional fine-tuning and model-averaging techniques are inc
1. Some figures (e.g., result comparisons and t-SNE analyses of parameter distributions) are missing, limiting the reader’s ability to evaluate the quality and diversity of generated parameters visually. The appendix is also missing. 2. While the paper mentions using several datasets (e.g., GLUE, SemArt, WikiArt), specific citations and descriptions are absent. This omission makes it challenging for readers to understand the dataset characteristics and reproducibility. Additionally, results on t
The idea of generating LoRA parameters with a diffusion model is quite interesting. Overall, the paper is well written and clearly explained. The experiments and results section is comprehensive with results on various tasks.
1. One major concern is the absence of the corresponding figures in Section 4.4 (Analysis). It is not possible to make any inferences about the experiments without the referenced figures. And I think Section 4.4 is important to understand whether the diffusion model is learning anything new compared to the original LoRA weights. 2. One important baseline that would be useful in this setup is to randomly select one of the training checkpoints and analyse it for a new test image (in case of sty
1. The authors propose a new approach using a latent diffusion model to synthesize LoRA weights for pretrained models. 2. The authors gather a dataset of LoRA weights tailored to specific pretrained models, using it to train the latent diffusion model. 3. Experiments on both language and image tasks show that this method can synthesize LoRA weights for pretrained models, achieving good performance.
1. The dataset size is small, which risks the model memorizing the data. While the authors show that the synthesized parameters differ from the training data, the diffusion process may only add Gaussian noise to the parameters in the training dataset. Additionally, the authors' use of averaging synthesized weights for specific tasks may eliminate Gaussian noise effects in the diffusion process. They should present performance results without averaging. 2. The claim that the proposed method generalizes wel
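The averaging concern raised in this review can be illustrated with a toy model. The sketch below assumes the reviewer's hypothesis, which is not a finding of the paper: that each generated sample is a memorized training checkpoint plus independent Gaussian noise. Under that assumption, averaging n samples shrinks the noise by about a factor of sqrt(n), so averaged weights land close to the checkpoint even when individual samples look different from it. The dimension, sample count, and noise scale are hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_samples, sigma = 1000, 64, 0.1   # hypothetical sizes and noise scale

# Hypothetical memorized training checkpoint (flattened LoRA parameters).
theta = rng.standard_normal(dim)

# Assumed generator behavior: checkpoint plus i.i.d. Gaussian noise.
samples = theta + sigma * rng.standard_normal((n_samples, dim))

def rms(x):
    """Root-mean-square magnitude of a vector."""
    return float(np.sqrt(np.mean(x ** 2)))

single_err = rms(samples[0] - theta)           # about sigma
avg_err = rms(samples.mean(axis=0) - theta)    # about sigma / sqrt(n_samples)

print(f"single-sample RMS deviation: {single_err:.4f}")
print(f"averaged RMS deviation:      {avg_err:.4f}")
```

This is why reporting performance for individual, non-averaged samples would help separate genuine generation from noisy memorization.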
* The paper goes beyond shallow architectures, unlike existing works in the area, and presents results with network architectures that are widely used. Moreover, it uses pre-trained networks and studies parameter generation for LoRAs, which makes it practically compelling. * The overall methodology is simple, and the components involved have been studied well in the paper and in the literature. * Studies two different modalities (language and images) that make it appealing.
* Even though the paper studies two modalities, the number of tasks studied seems limited as far as practicality is concerned, which is a central theme of this work. For the image domain, it studies only image stylization but misses out on other potentially more practical tasks that may require more samples as conditions (image classification, for example). * Comparisons between LoRA fine-tuning time and LoRA parameter generation time (including the COND P-DIFF training time) seem to be missing
Taxonomy
Topics: Target Tracking and Data Fusion in Sensor Networks
Methods: Latent Diffusion Model · Diffusion
