GUIDE: Guidance-based Incremental Learning with Diffusion Models
Bartosz Cywiński, Kamil Deja, Tomasz Trzciński, Bartłomiej Twardowski, Łukasz Kuciński

TL;DR
GUIDE is a continual learning method that uses classifier-guided diffusion models to generate targeted rehearsal samples, effectively reducing catastrophic forgetting and outperforming existing generative replay techniques.
Contribution
It introduces a guidance-based sampling strategy in diffusion models for rehearsal, bridging the gap between generative and buffer-based continual learning methods.
Findings
GUIDE significantly reduces catastrophic forgetting.
Outperforms state-of-the-art continual learning methods.
Generates more relevant rehearsal samples for past tasks.
Abstract
We introduce GUIDE, a novel continual learning approach that directs diffusion models to rehearse samples at risk of being forgotten. Existing generative strategies combat catastrophic forgetting by randomly sampling rehearsal examples from a generative model. Such an approach contradicts buffer-based approaches where sampling strategy plays an important role. We propose to bridge this gap by incorporating classifier guidance into the diffusion process to produce rehearsal examples specifically targeting information forgotten by a continuously trained model. This approach enables the generation of samples from preceding task distributions, which are more likely to be misclassified in the context of recently encountered classes. Our experimental results show that GUIDE significantly reduces catastrophic forgetting, outperforming conventional random sampling approaches and surpassing…
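The guidance idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hand-written linear softmax classifier on 2D points in place of the continually trained classifier, and a fixed `mean`/`variance` pair in place of the diffusion model's predicted reverse-step distribution. The function names (`class_probs`, `guided_mean`, `guided_step`) are hypothetical. The mean shift follows the standard classifier-guidance rule of sampling from a Gaussian whose mean is moved by `scale * variance * grad log p(y | x)`; steering an old-class sample toward a recently encountered class is one reading of the abstract, not a confirmed detail of the method.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_probs(x, weights, biases):
    """Class probabilities of a toy linear softmax classifier (assumed stand-in)."""
    logits = [sum(w_d * x_d for w_d, x_d in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    return softmax(logits)


def grad_log_prob(x, target, weights, biases):
    """Analytic gradient of log p(target | x) with respect to x."""
    probs = class_probs(x, weights, biases)
    return [weights[target][d] - sum(p * w[d] for p, w in zip(probs, weights))
            for d in range(len(x))]


def guided_mean(mean, variance, target, weights, biases, scale):
    """Shift the reverse-step mean toward the target class."""
    grad = grad_log_prob(mean, target, weights, biases)
    return [m + scale * variance * g for m, g in zip(mean, grad)]


def guided_step(mean, variance, target, weights, biases, scale, rng):
    """Draw one sample from N(guided mean, variance * I)."""
    shifted = guided_mean(mean, variance, target, weights, biases, scale)
    std = math.sqrt(variance)
    return [m + std * rng.gauss(0.0, 1.0) for m in shifted]


if __name__ == "__main__":
    # Class 0 plays the role of an old-task class, class 1 a current-task class.
    weights = [[-1.0, 0.0], [1.0, 0.0]]
    biases = [0.0, 0.0]
    old_class_mean = [-2.0, 0.0]  # assumed denoiser output for an old-class sample
    rng = random.Random(0)
    sample = guided_step(old_class_mean, 0.5, 1, weights, biases, 2.0, rng)
    print(sample, class_probs(sample, weights, biases))
```

With `scale = 0` the step reduces to unguided sampling around `mean`; larger values of `scale` pull old-class samples closer to the region the classifier assigns to the newer class.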
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
1) Generating examples in a specific way (i.e., on the decision boundary) to relieve forgetting is novel and compelling. 2) Leveraging the past classifier as guidance offers interesting insights. 3) The ablation studies outline the impact of different components effectively.
1) Experimental results: - I did not find the experimental setting clear (e.g., lacking the number of epochs/iterations for each setting) - Results reported in Table 1 are suspiciously low (w.r.t. the results reported by competitors such as DDGR). For instance, DDGR reports an accuracy that is almost **four** times higher on CIFAR-100, 10 tasks. - Results in Table 2 are misleading. All methodologies reported as competitors (except for GSS) were introduced for a multi-epoch setup: evaluatin
The paper provides good intuition for its approach, showcasing with qualitative and 2D-projection-based examples how different techniques sample the space. I also like the hyperparameter sensitivity analysis performed to show robustness, and the different possible variants of GUIDE. While I've not been in touch with the continual learning community for a while, I think these bare minimum things are necessary.
In terms of weaknesses, I have a few questions - The sampling procedure as outlined in Alg. 2 in the appendix uses the model $f_{\phi}$, which may or may not be trained well enough to understand which class, among the current set of classes in the task, the image from the previous task (as generated by the diffusion model during the sampling steps) belongs to. Moreover, shouldn't it affect the classifier guidance as well when sampling from the previous task's class? This conceptually
1. This paper is well-written and easy to follow. 2. I appreciate the extensive experiments.
My main concern is about the technical contribution of this paper. 1. The main technique of this paper is Eq. 1 and Eq. 2, which come from [1]. In terms of technique, this paper is not novel. This means that this paper is an (a+b)-like work which merges the technique in [1], diffusion models, and the continual learning setting. Therefore, I think the lack of technical contribution is the main weakness of this paper. 2. (a+b)-like work is OK, but this paper doesn't show enough motivations about w
1. The paper introduces the idea of using information from the current task in generating replay samples to generative replay CL. 2. Compared to diffusion-based generative replay CL, the proposed method seems to be a more effective way to utilize the diffusion model.
1. The technical novelty and contribution are limited. The guidance of diffusion model generation follows standard procedures and the high-level idea follows the memory replay CL approach of RAR. 2. The presentation can be improved. Many figures and tables are extremely small (e.g., Table 1 and Figure 2). Also, Eq. 6 is confusing. Based on the definition of y_i in Eq. 7, the y_{i-1} in Eq. 6 should only refer to a certain class of task i-1, which I don't think is the intention of the author (correc
Taxonomy
Topics: Scheduling and Timetabling Solutions
Methods: Diffusion
