Conditional LoRA Parameter Generation

Xiaolong Jin; Kai Wang; Dongwen Tang; Wangbo Zhao; Yukun Zhou; Junshu Tang; Yang You

arXiv:2408.01415·cs.AI·August 5, 2024

Open Access · 5 Reviews

TL;DR

This paper introduces COND P-DIFF, a conditional diffusion model that generates high-performance LoRA weights for neural networks, enabling task-specific adaptation during fine-tuning in vision and NLP domains.

Contribution

The paper presents a novel conditional latent diffusion approach for generating high-quality LoRA parameters, advancing controllable and efficient neural network adaptation.

Findings

01 · The model generates task-specific, high-performance parameters.

02 · The distribution of generated parameters differs from that of conventionally optimized parameters, which the paper presents as a sign of generalization.

03 · The approach is effective on both vision and NLP tasks.

Abstract

Generative models have achieved remarkable success in image, video, and text domains. Inspired by this, researchers have explored utilizing generative models to generate neural network parameters. However, these efforts have been limited by the parameter size and the practicality of generating high-performance parameters. In this paper, we propose COND P-DIFF, a novel approach that demonstrates the feasibility of controllable high-performance parameter generation, particularly for LoRA (Low-Rank Adaptation) weights, during the fine-tuning process. Specifically, we employ an autoencoder to extract efficient latent representations for parameters. We then train a conditional latent diffusion model to synthesize high-performing model parameters from random noise based on specific task conditions. Experimental results in both computer vision and natural language processing domains…
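The pipeline in the abstract treats a set of LoRA weights as a single data point that a diffusion model can learn to denoise. The following is a minimal NumPy sketch of that data handling, not the authors' implementation: it flattens LoRA `A`/`B` matrices into one vector and applies the standard DDPM forward-noising step. The helper names, layer shapes, and linear beta schedule are all assumptions made for illustration, and both the autoencoder and the task condition are omitted, so noise is added to the raw parameter vector rather than to a learned latent.

```python
import numpy as np

def flatten_lora(lora_layers):
    """Concatenate every LoRA (A, B) matrix pair into one 1-D parameter vector."""
    return np.concatenate([m.ravel() for pair in lora_layers for m in pair])

def unflatten_lora(vec, shapes):
    """Inverse of flatten_lora: split a vector back into (A, B) pairs."""
    layers, offset = [], 0
    for shape_a, shape_b in shapes:
        size_a, size_b = int(np.prod(shape_a)), int(np.prod(shape_b))
        a = vec[offset:offset + size_a].reshape(shape_a)
        offset += size_a
        b = vec[offset:offset + size_b].reshape(shape_b)
        offset += size_b
        layers.append((a, b))
    return layers

def forward_noise(x0, t, alpha_bar, rng):
    """DDPM forward step: x_t = sqrt(ab_t) * x_0 + sqrt(1 - ab_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)

# Hypothetical sizes: two adapted layers, rank 4, 16x16 base weights.
rank, d_in, d_out = 4, 16, 16
shapes = [((rank, d_in), (d_out, rank))] * 2
lora_layers = [(rng.standard_normal(sa), rng.standard_normal(sb))
               for sa, sb in shapes]

vec = flatten_lora(lora_layers)          # one training sample for the generator
betas = np.linspace(1e-4, 0.02, 1000)    # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)
x_t, eps = forward_noise(vec, 500, alpha_bar, rng)

restored = unflatten_lora(vec, shapes)   # round trip back to (A, B) matrices
```

In a full system, a denoising network would be trained to predict `eps` from `x_t`, the timestep, and a task embedding; the round trip through `unflatten_lora` is what lets a sampled vector be loaded back into the adapted layers.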

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01 · Rating 1 · Confidence 5

Strengths

1. The presented qualitative results are on par or slightly better than baselines.

Weaknesses

1. The paper seems very rushed and incomplete. Figures are missing even though there are references in the text. As a consequence, no visual results are provided, nor other analysis claimed in the paper.
2. The novelty is very limited. The authors build on top of P-Diff by adding only a task conditioning mechanism.
3. There are numerous editorial errors, e.g. lack of spaces (lines 061, 222, 292), wrong formatting (eq. 6b, 7), grammatical errors (270-271), Fig vs Figure, etc.

Reviewer 02 · Rating 5 · Confidence 3

Strengths

1. Originality: The idea of using conditional diffusion models to synthesize LoRA parameters for fine-tuning is novel, providing an innovative method for model adaptation. To the best of my knowledge, this is the first paper that bridges parameter-efficient tuning methods and conditional diffusion parameter generation.
2. Quality: The method is well-defined, with experiments that confirm the approach’s effectiveness. Comparisons with traditional fine-tuning and model-averaging techniques are inc

Weaknesses

1. Some figures (e.g., result comparisons and t-SNE analyses of parameter distributions) are missing, limiting the reader’s ability to evaluate the quality and diversity of generated parameters visually. The appendix is also missing.
2. While the paper mentions using several datasets (e.g., GLUE, SemArt, WikiArt), specific citations and descriptions are absent. This omission makes it challenging for readers to understand the dataset characteristics and reproducibility. Additionally, results on t

Reviewer 03 · Rating 3 · Confidence 4

Strengths

The idea of generating LoRA parameters with a diffusion model is quite interesting. Overall, the paper is well written and clearly explained. The experiments and results section is comprehensive with results on various tasks.

Weaknesses

1. One major concern is the absence of the corresponding figures in Section 4.4 (Analysis). It is not possible to make any inferences about the experiments without the referenced figures. I think Section 4.4 is important for understanding whether the diffusion model is learning anything new compared to the original LoRA weights.
2. One important baseline that would be useful in this setup is to randomly select one of the training checkpoints and analyse it for a new test image (in case of sty

Reviewer 04 · Rating 3 · Confidence 4

Strengths

1. The authors propose a new approach using a latent diffusion model to synthesize LoRA weights for pretrained models.
2. The authors gather a dataset of LoRA weights tailored to specific pretrained models, using it to train the latent diffusion model.
3. Experiments on both language and image tasks show that this method can synthesize LoRA weights for pretrained models, achieving good performance.

Weaknesses

1. The dataset size is small, which risks the model memorizing the data. While the authors show that the synthesized parameters differ from the training data, the diffusion process may only add Gaussian noise to the parameters in the training dataset. Additionally, the authors' use of averaging synthesized weights for specific tasks may eliminate the effects of Gaussian noise in the diffusion process. They should present performance results without averaging.
2. The claim that the proposed method generalizes wel
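The averaging concern in the first weakness can be illustrated with a small numerical sketch. It assumes the worst case the reviewer describes, in which each "generated" sample is just a training checkpoint plus independent Gaussian noise; the dimensions, noise scale, and variable names are hypothetical and are not taken from the paper. Under that assumption, averaging `n_samples` draws shrinks the deviation from the checkpoint by roughly a factor of `sqrt(n_samples)`.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_samples, sigma = 1000, 16, 0.1

# Stand-in for one memorized training checkpoint (flattened LoRA weights).
trained = rng.standard_normal(dim)

# Worst-case "generation": the checkpoint plus independent Gaussian noise.
samples = trained + sigma * rng.standard_normal((n_samples, dim))

# Root-mean-square deviation of each individual sample from the checkpoint.
per_sample_rmse = np.sqrt(np.mean((samples - trained) ** 2, axis=1))

# Deviation of the averaged sample; expected to be near sigma / sqrt(n_samples).
averaged_rmse = np.sqrt(np.mean((samples.mean(axis=0) - trained) ** 2))

print(per_sample_rmse.mean(), averaged_rmse)
```

If generated parameters behaved like this, the averaged weights would sit much closer to the memorized checkpoint than any single sample does, which is why the reviewer asks for results without averaging.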

Reviewer 05 · Rating 5 · Confidence 4

Strengths

* The paper goes beyond shallow architectures, unlike existing works in the area, and presents results with network architectures that are widely used. Moreover, it uses pre-trained networks and studies parameter generation for LoRAs, which makes it practically compelling.
* The overall methodology is simple, and the components involved have been studied well in the paper and in the literature.
* Studies two different modalities (language and images) that make it appealing.

Weaknesses

* Even though the paper studies two modalities, the number of tasks studied seems limited as far as practicality is concerned, which is a central theme of this work. For the image domain, it studies only image stylization but misses out on other potentially more practical tasks that may require more samples as conditions (image classification, for example).
* Comparisons between LoRA fine-tuning time and LoRA parameter generation time (including the COND P-DIFF training time) seem to be missing

Taxonomy

Topics: Target Tracking and Data Fusion in Sensor Networks

Methods: Latent Diffusion Model · Diffusion