AutoLoRA: AutoGuidance Meets Low-Rank Adaptation for Diffusion Models

Artur Kasymov; Marcin Sendera; Michał Stypułkowski; Maciej Zięba; Przemysław Spurek

arXiv:2410.03941·cs.CV·October 8, 2024

Open Access · 1 Repo · 4 Reviews

TL;DR

AutoLoRA is a guidance method for LoRA-fine-tuned diffusion models that trades off domain consistency against sample variability to improve the diversity and quality of generated images; the authors report that it outperforms existing guidance techniques.

Contribution

The paper proposes AutoLoRA, a novel guidance approach that improves diversity and quality in LoRA fine-tuned diffusion models, addressing limitations of context bias and low variability.

Findings

01

AutoLoRA outperforms existing guidance methods on multiple metrics.

02

Incorporating classifier-free guidance enhances diversity and quality.

03

AutoLoRA effectively balances domain consistency with sample variability.

Abstract

Low-rank adaptation (LoRA) is a fine-tuning technique that can be applied to conditional generative diffusion models. LoRA utilizes a small number of context examples to adapt the model to a specific domain, character, style, or concept. However, due to the limited data utilized during training, the fine-tuned model performance is often characterized by strong context bias and a low degree of variability in the generated images. To solve this issue, we introduce AutoLoRA, a novel guidance technique for diffusion models fine-tuned with the LoRA approach. Inspired by other guidance techniques, AutoLoRA searches for a trade-off between consistency in the domain represented by LoRA weights and sample diversity from the base conditional diffusion model. Moreover, we show that incorporating classifier-free guidance for both LoRA fine-tuned and base models leads to generating samples with…
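The trade-off described in the abstract can be illustrated with a minimal sketch. This page does not give the paper's update rule, so the linear form below is an assumption modeled on classifier-free guidance and autoguidance: each model's noise prediction is first combined with its own classifier-free guidance, and the result is then moved from the base prediction toward the LoRA prediction. The function names and the parameters `w_base`, `w_lora`, and `gamma` are hypothetical and do not come from the authors' repository.

```python
import numpy as np


def cfg(eps_cond, eps_uncond, w):
    """Classifier-free guidance: move from the unconditional
    prediction toward the conditional one with scale w."""
    return eps_uncond + w * (eps_cond - eps_uncond)


def base_to_lora_guidance(eps_base, eps_lora, gamma):
    """ASSUMED form of the base/LoRA trade-off (not taken from the paper):
    gamma = 0 returns the base prediction, gamma = 1 the LoRA prediction,
    and gamma > 1 extrapolates past the LoRA prediction."""
    return eps_base + gamma * (eps_lora - eps_base)


def guided_noise(base_cond, base_uncond, lora_cond, lora_uncond,
                 w_base, w_lora, gamma):
    """Apply classifier-free guidance to each model separately,
    then combine the two guided predictions."""
    eps_base = cfg(base_cond, base_uncond, w_base)
    eps_lora = cfg(lora_cond, lora_uncond, w_lora)
    return base_to_lora_guidance(eps_base, eps_lora, gamma)


# Toy usage: random arrays stand in for the four noise predictions
# that real base and LoRA-tuned denoisers would produce at one step.
rng = np.random.default_rng(0)
base_c, base_u, lora_c, lora_u = rng.normal(size=(4, 2, 3))
eps = guided_noise(base_c, base_u, lora_c, lora_u,
                   w_base=3.0, w_lora=5.0, gamma=1.5)
```

In this sketch, `gamma` is the single knob that moves the sample between the base model's variability and the LoRA model's domain consistency, while `w_base` and `w_lora` control prompt adherence for each model independently.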

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01 · Rating 1 · Confidence 4

Strengths

The core idea is to provide a sampling procedure with the trade-off between exploring the directions with the base conditional diffusion model consistent with the path determined by the LoRA finetuned version, similar to classifier-free guidance.

Weaknesses

I carefully reviewed the paper, and rather than addressing an extensive list of less critical and debatable issues, I will concentrate on the most significant concerns that influence my rating. My focus will be on the following three points: 1) The novelty is very marginal. The idea is a straightforward extension of AutoGuidance, yet instead of conditional and unconditional, here, the guidance is based on the base and LoRA finetuned models. 2) The diversity improvement is questionable and ins

Reviewer 02 · Rating 5 · Confidence 4

Strengths

1. The paper is well written and easy to follow. 2. This paper presents a novel guidance technique on finetune-based diffusion models, which combines LoRA with an autoguidance strategy to improve the quality and diversity of generated samples. 3. This paper proposes some reasonable metrics to evaluate the diversity and consistency of generated samples, including Diversity, Character Presence Score (CPS), Prompt Correspondence (PC) and Style Adherence (SA).

Weaknesses

1. It appears that the paper lacks a theoretical analysis explaining the design of AutoLoRA and why it is effective in balancing the diversity of the original pre-trained model with the consistency of the LoRA-tuned model. Providing a more in-depth theoretical justification for AutoLoRA would help to differentiate it from AutoGuidance. 2. The paper also appears to lack comparison experiments with other fine-tuning methods for diffusion models, such as DreamBooth and Textual Inversion. Furthermore

Reviewer 03 · Rating 3 · Confidence 3

Strengths

1. Originality: The paper introduces a unique combination of AutoGuidance and LoRA for diffusion models, expanding on traditional guidance approaches. The new inference scheme can achieve a higher generated image diversity while maintaining prompt correspondence. 2. Quality: The study is methodologically sound, detailing the AutoLoRA algorithm and presenting clear mathematical formulations. The experiments include comparisons across different diffusion models and LoRA modules, and they explore h

Weaknesses

1. There are some passages affecting the fluency of reading the paper. For example, the AutoGuidance in the Introduction shall be discussed after introducing the problem the authors want to solve instead of at the very beginning. The related work and preliminaries are too detailed, decreasing the importance of the proposed method. 2. The quantitative improvements provided by AutoLoRA over CFG (e.g., in Div-CPS and Div-SA) are not substantial across all experimental setups. The experiments are ce

Reviewer 04 · Rating 3 · Confidence 3

Strengths

The proposed method is simple, easy to implement, and practical for the community.

Weaknesses

(1) The motivation is unconvincing. The authors state in Lines 44–45 that LoRA reduces the diversity of generated images. However, Table 1 shows that LoRA demonstrates higher diversity across different LoRA scales compared to the proposed AutoLoRA. I recommend that the authors revise the introduction to ensure consistency in their arguments. The proposed method appears trivial, as it is merely a straightforward combination of LoRA and AutoGuidance. I recommend that the authors provide further ju

Code & Models

Repositories

gmum/AutoLoRA
Official


Taxonomy

Topics: Machine Learning in Healthcare

Methods: Balanced Selection · Diffusion