UnGuide: Learning to Forget with LoRA-Guided Diffusion Models

Agnieszka Polowczyk; Alicja Polowczyk; Dawid Malarz; Artur Kasymov; Marcin Mazur; Jacek Tabor; Przemysław Spurek

arXiv:2508.05755·cs.CV·August 11, 2025

Open Access · 3 Reviews

TL;DR

UnGuide introduces a dynamic inference mechanism that enhances machine unlearning in diffusion models by precisely controlling concept removal while preserving image quality.

Contribution

It presents UnGuide, a novel method combining LoRA with classifier-free guidance to improve targeted unlearning in diffusion models.

Findings

01

Outperforms existing LoRA-based methods in concept erasure.

02

Achieves controlled removal of specific concepts without degrading image fidelity.

03

Maintains the expressive power of diffusion models during unlearning.

Abstract

Recent advances in large-scale text-to-image diffusion models have heightened concerns about their potential misuse, especially in generating harmful or misleading content. This underscores the urgent need for effective machine unlearning, i.e., removing specific knowledge or concepts from pretrained models without compromising overall performance. One possible approach is Low-Rank Adaptation (LoRA), which offers an efficient means to fine-tune models for targeted unlearning. However, LoRA often inadvertently alters unrelated content, leading to diminished image fidelity and realism. To address this limitation, we introduce UnGuide -- a novel approach which incorporates UnGuidance, a dynamic inference mechanism that leverages Classifier-Free Guidance (CFG) to exert precise control over the unlearning process. UnGuide modulates the guidance scale based on the stability of a few first…
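For readers unfamiliar with the guidance mechanism the abstract refers to, here is a minimal sketch of the standard Classifier-Free Guidance combination of an unconditional and a conditional noise prediction. The function name `cfg_combine` and the plain-list representation of a noise prediction are illustrative assumptions. How UnGuide selects the guidance scale is cut off in the abstract above and is not modeled here.

```python
from typing import List, Sequence


def cfg_combine(eps_uncond: Sequence[float],
                eps_cond: Sequence[float],
                scale: float) -> List[float]:
    """Standard CFG: eps = eps_uncond + scale * (eps_cond - eps_uncond).

    The inputs stand in for flattened noise predictions (hypothetical
    representation chosen only to keep the example dependency-free).
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# scale = 0 returns the unconditional prediction, scale = 1 the conditional
# one, and scale > 1 extrapolates past the conditional prediction.
guided = cfg_combine([0.0, 1.0], [1.0, 3.0], scale=2.0)
```

A larger scale pushes generation more strongly toward the prompt, which is why a mechanism that adjusts the scale per prompt can trade off forgetting against fidelity.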

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 3

Strengths

- The authors reframe the removal of a specific concept as LoRA-based adaptation with a linear target in the noise prediction space, rather than as architectural changes or prompt rewrites. This leads to a simple but effective unlearning framework.
- A new metric is introduced to evaluate unlearning performance. It is the harmonic mean of effectiveness, specificity, and generality, and is meaningful and clear.
- UnGuide enables controllable concept erasure across differ
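The metric mentioned in the strengths above can be illustrated with a minimal sketch of a three-way harmonic mean. The function name `unlearning_score`, the assumption that each component is a non-negative score, and the convention of returning 0 when any component is 0 are illustrative choices, not details taken from the paper.

```python
def unlearning_score(effectiveness: float,
                     specificity: float,
                     generality: float) -> float:
    """Harmonic mean of three non-negative scores (hypothetical helper)."""
    scores = (effectiveness, specificity, generality)
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    # Assumed convention: one failed component drives the whole score to 0.
    if any(s == 0 for s in scores):
        return 0.0
    return len(scores) / sum(1.0 / s for s in scores)


balanced = unlearning_score(0.5, 0.5, 0.5)
skewed = unlearning_score(1.0, 0.5, 0.25)
```

Unlike an arithmetic mean, the harmonic mean is dominated by the weakest component, so a method cannot score well by excelling on one axis while failing another.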

Weaknesses

- The main concern is the insufficiently justified design of the loss function for unlearning. The objective pushes the noise prediction of the LoRA-guided model for the forbidden concept c toward a linear combination of the predictions of the original model, as given by eq. (4). In the article, the authors do not explain why this geometry in the noise space should be optimal for unlearning, nor why linearity (as opposed to other divergences or constraints) is appropriate.
- The prompt-conditio
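The kind of objective the reviewer describes can be sketched as a regression toward a linear combination of two frozen-model predictions. This is only an illustration of the general shape: eq. (4) is not reproduced on this page, so the coefficients `alpha` and `beta`, the choice of which two predictions are combined, the mean-squared-error distance, and all function names below are hypothetical placeholders.

```python
from typing import List, Sequence


def linear_target(eps_a: Sequence[float],
                  eps_b: Sequence[float],
                  alpha: float,
                  beta: float) -> List[float]:
    """alpha * eps_a + beta * eps_b, element-wise.

    eps_a and eps_b stand in for two noise predictions of the frozen
    original model; alpha and beta are hypothetical coefficients.
    """
    if len(eps_a) != len(eps_b):
        raise ValueError("predictions must have the same length")
    return [alpha * a + beta * b for a, b in zip(eps_a, eps_b)]


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between the adapted prediction and the target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


target = linear_target([1.0, 2.0], [3.0, 4.0], alpha=2.0, beta=-1.0)
loss = mse([0.0, 0.0], target)
```

The reviewer's question is why a target of this linear form, rather than another divergence or constraint, should be the right one for unlearning.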

Reviewer 02 · Rating 2 · Confidence 3

Strengths

The work is original in its formulation of adaptive guidance for unlearning. While prior methods such as MACE rely on complex prompt or segmentation modifications, UnGuide innovatively combines a standard LoRA fine-tuning setup with a new variance-based guidance mechanism to dynamically balance between forgetting and fidelity.

Weaknesses

The paper presents an interesting direction but suffers from several issues in organization, clarity, and experimental rigor that obscure its true contributions and weaken its overall impact.

* **Poor organization and unclear flow:** The paper’s structure makes it difficult to distinguish between background material and novel contributions. For instance, the description of the *Text-to-Image generation framework*—a preliminary concept—is embedded directly in the *Methodology* section, and is im

Reviewer 03 · Rating 2 · Confidence 3

Strengths

- The experimental results are strong, showing clear advantages over prior methods.
- The method is also much simpler than the strongest prior methods, as it does not require external segmentation components.

Weaknesses

- The paper lacks ablations comparing the method to a base LoRA approach, i.e., without un-guidance. The authors state that the key component of the work is the dynamic switching between the base and LoRA models, but as far as I can see this is not ablated in the experiments. Including this ablation (or making it more prominent, in case I missed it) would greatly strengthen the insights obtained from the paper.
- I also could not find an explicit formula for w. Is it just a bina

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Neural Networks and Applications · Machine Learning in Healthcare