GUIDE: Guidance-based Incremental Learning with Diffusion Models

Bartosz Cywiński; Kamil Deja; Tomasz Trzciński; Bartłomiej Twardowski; Łukasz Kuciński

arXiv:2403.03938·cs.LG·June 3, 2024·1 cites

TL;DR

GUIDE is a continual learning method that uses classifier-guided diffusion models to generate targeted rehearsal samples, effectively reducing catastrophic forgetting and outperforming existing generative replay techniques.

Contribution

The paper introduces a guidance-based sampling strategy for rehearsal with diffusion models, bridging the gap between generative and buffer-based continual learning methods.
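For context, classifier guidance in its standard form (Dhariwal and Nichol, 2021) shifts the diffusion model's noise prediction along the gradient of a classifier's log-probability. The notation below follows that standard formulation and is an assumption relative to this page; the paper's own equations may use different symbols. Here $\epsilon_\theta$ is the noise predictor, $p_\phi$ the classifier, $s$ the guidance scale, and $\bar{\alpha}_t$ the cumulative noise schedule:

```latex
\hat{\epsilon}_\theta(x_t, t)
  = \epsilon_\theta(x_t, t)
  - s \, \sqrt{1 - \bar{\alpha}_t} \; \nabla_{x_t} \log p_\phi(y \mid x_t)
```

Per the abstract, GUIDE takes $p_\phi$ to be the continually trained model and chooses $y$ so that generated samples of earlier tasks drift toward recently encountered classes.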

Findings

01. GUIDE significantly reduces catastrophic forgetting.

02. Outperforms state-of-the-art continual learning methods.

03. Generates more relevant rehearsal samples for past tasks.

Abstract

We introduce GUIDE, a novel continual learning approach that directs diffusion models to rehearse samples at risk of being forgotten. Existing generative strategies combat catastrophic forgetting by randomly sampling rehearsal examples from a generative model. Such an approach contradicts buffer-based approaches where sampling strategy plays an important role. We propose to bridge this gap by incorporating classifier guidance into the diffusion process to produce rehearsal examples specifically targeting information forgotten by a continuously trained model. This approach enables the generation of samples from preceding task distributions, which are more likely to be misclassified in the context of recently encountered classes. Our experimental results show that GUIDE significantly reduces catastrophic forgetting, outperforming conventional random sampling approaches and surpassing…
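The guidance idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hypothetical two-class linear softmax classifier in 2D standing in for the continually trained model, uses an analytic gradient instead of autograd, and omits the diffusion noise and variance scaling entirely. All names (`class_probs`, `grad_log_prob`, `guided_mean`, `WEIGHTS`) are illustrative. Class 0 plays the role of a previous-task class and class 1 a current-task class; one guided step moves a previous-task sample toward the decision boundary.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_probs(x, weights, biases):
    """Class probabilities of a linear softmax classifier (toy stand-in)."""
    logits = [sum(w_d * x_d for w_d, x_d in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    return softmax(logits)


def grad_log_prob(x, weights, biases, target):
    """Analytic gradient of log p(target | x) with respect to x.

    For a linear softmax classifier this is W[target] - sum_k p_k * W[k].
    """
    probs = class_probs(x, weights, biases)
    dim = len(x)
    expected_w = [sum(p * w[d] for p, w in zip(probs, weights))
                  for d in range(dim)]
    return [weights[target][d] - expected_w[d] for d in range(dim)]


def guided_mean(x, mean, weights, biases, target, scale):
    """Shift a proposed denoising mean along the classifier gradient.

    Assumption: noise and variance scaling of a real sampler are omitted.
    """
    grad = grad_log_prob(x, weights, biases, target)
    return [m + scale * g for m, g in zip(mean, grad)]


# Hypothetical classifier: class 0 (previous task) prefers x[0] > 0,
# class 1 (current task) prefers x[0] < 0.
WEIGHTS = [[1.0, 0.0], [-1.0, 0.0]]
BIASES = [0.0, 0.0]

x_t = [2.0, 0.0]  # a sample that clearly belongs to the previous-task class
before = class_probs(x_t, WEIGHTS, BIASES)[1]
shifted = guided_mean(x_t, x_t, WEIGHTS, BIASES, target=1, scale=0.5)
after = class_probs(shifted, WEIGHTS, BIASES)[1]

print("p(current-task class) before guidance:", before)
print("p(current-task class) after guidance: ", after)
```

With a small guidance scale the sample stays on the previous-task side of the boundary while becoming easier to confuse with the current-task class, which is the kind of rehearsal example the abstract describes.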

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01·Rating 3·Confidence 4

Strengths

1) Generating examples in a specific way (i.e., on the decision boundary) to relieve forgetting is novel and compelling. 2) Leveraging the past classifier as guidance offers interesting insights. 3) The ablation studies outline the impact of different components effectively.

Weaknesses

1) Experimental results:
- I did not find the experimental setting clear (e.g., the number of epochs/iterations for each setting is missing).
- Results reported in Table 1 are suspiciously low (w.r.t. the results reported by competitors such as DDGR). For instance, DDGR reports an accuracy that is almost **four** times higher on CIFAR-100, 10 tasks.
- Results in Table 2 are misleading. All methodologies reported as competitors (except for GSS) were introduced for a multi-epoch setup: evaluatin

Reviewer 02·Rating 5·Confidence 3

Strengths

The paper provides good intuition for its approach, showcasing it with different qualitative and 2D-projection-based examples of how different techniques sample the space. I also like the hyperparameter sensitivity analysis performed to show robustness, and the different possible variants of GUIDE. While I have not been in touch with the continual learning community for a while, I think these bare-minimum things are necessary.

Weaknesses

In terms of weaknesses, I have a few questions. - The sampling procedure outlined in Alg. 2 in the appendix uses the model $f_{\phi}$, which may or may not be trained well enough to tell which of the current task's classes an image from the previous task (as generated by the diffusion model during the sampling steps) belongs to. Moreover, shouldn't it affect the classifier guidance as well when sampling from the previous task's class? This conceptually

Reviewer 03·Rating 5·Confidence 4

Strengths

1. This paper is well-written and easy to follow. 2. I appreciate the extensive experiments.

Weaknesses

My main concern is about the technical contribution of this paper.
1. The main technique of this paper is Eq. 1 and Eq. 2, which come from [1]. From a technical point of view, this paper is not novel. This means that this paper is an (a+b)-like work which merges the technique in [1], diffusion models, and the continual learning setting. Therefore, I think the lack of technical contribution is the main weakness of this paper.
2. (a+b)-like work is ok, but this paper doesn't show enough motivations about w

Reviewer 04·Rating 5·Confidence 5

Strengths

1. The paper introduces the idea of using information from the current task in generating replay samples to generative replay CL. 2. Compared to diffusion-based generative replay CL, the proposed method seems to be a more effective way to utilize the diffusion model.

Weaknesses

1. The technical novelty and contribution are limited. The guidance of diffusion model generation follows standard procedures, and the high-level idea follows the memory replay CL approach of RAR.
2. The presentation can be improved. Many figures and tables are extremely small (e.g., Table 1 and Figure 2). Also, Eq. 6 is confusing. Based on the definition of y_i in Eq. 7, the y_{i-1} in Eq. 6 should only refer to a certain class of task i-1, which I don't think is the intention of the author (correc

Code & Models

Repositories

cywinski/guide
PyTorch·Official

Taxonomy

Topics·Scheduling and Timetabling Solutions

Methods·Diffusion