Data Unlearning Beyond Uniform Forgetting via Diffusion Time and Frequency Selection

Jinseong Park; Mijung Park

arXiv:2510.17917·cs.LG·October 22, 2025

Open Access·3 Reviews

TL;DR

This paper introduces a novel approach to data unlearning in diffusion models by focusing on selective time-frequency ranges, improving unlearning quality and sample aesthetics without full retraining.

Contribution

It proposes a time-frequency selective method for data unlearning in diffusion models, addressing quality degradation and incomplete forgetting issues in prior methods.

Findings

1. Improved generation quality with selective unlearning
2. Effective unlearning across diverse tasks and objectives
3. Enhanced evaluation metrics for unlearning performance

Abstract

Data unlearning aims to remove the influence of specific training samples from a trained model without requiring full retraining. Unlike concept unlearning, data unlearning in diffusion models remains underexplored and often suffers from quality degradation or incomplete forgetting. To address this, we first observe that most existing methods attempt to unlearn the samples at all diffusion time steps equally, leading to poor-quality generation. We argue that forgetting occurs disproportionately across time and frequency, depending on the model and scenarios. By selectively focusing on specific time-frequency ranges during training, we achieve samples with higher aesthetic quality and lower noise. We validate this improvement by applying our time-frequency selective approach to diverse settings, including gradient-based and preference optimization objectives, as well as both image-level…
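The idea of restricting unlearning to a range of diffusion timesteps can be illustrated with a small sketch. The abstract does not say how the selection is implemented, so gating a per-sample loss with a timestep mask is an assumption here, and `time_window_mask` and `masked_mean_loss` are hypothetical helper names. The window [250, 750] in the usage line is the range one reviewer reports for CelebA-HQ.

```python
import numpy as np

def time_window_mask(timesteps, t_lo, t_hi):
    """Return 1.0 for timesteps inside [t_lo, t_hi] and 0.0 elsewhere."""
    t = np.asarray(timesteps)
    return ((t >= t_lo) & (t <= t_hi)).astype(np.float64)

def masked_mean_loss(per_sample_loss, timesteps, t_lo, t_hi):
    """Average an unlearning loss over only the samples whose timestep is selected.

    Assumption: selection is applied as a multiplicative mask on the loss.
    """
    mask = time_window_mask(timesteps, t_lo, t_hi)
    selected = mask.sum()
    if selected == 0:
        return 0.0  # no sample in this batch falls inside the window
    losses = np.asarray(per_sample_loss, dtype=np.float64)
    return float((losses * mask).sum() / selected)

# Only the samples drawn at t=300 and t=500 contribute to the loss.
loss = masked_mean_loss([1.0, 2.0, 3.0, 4.0], [100, 300, 500, 900], 250, 750)
```

An alternative under the same assumption would be to sample timesteps only from the window in the first place, which avoids spending compute on masked-out samples.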

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 6·Confidence 3

Strengths

- The paper provides empirical investigation across multiple dimensions. The toy experiment on two half-moons (Figure 2) clearly illustrates how different timestep ranges affect unlearning. The gradient norm analysis (Figure 4b) provides concrete evidence that middle-to-late timesteps are most affected by unlearning. The power spectral density analysis (Figures 5-6) offers compelling evidence for the frequency filtering hypothesis. These analyses are well-grounded in prior work on diffusion mode
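For readers unfamiliar with the power spectral density analysis the reviewer cites, the sketch below shows one common way to compute a radially averaged spectrum of a single grayscale image with NumPy. It is a generic illustration, not the procedure behind the paper's Figures 5-6, and `radial_psd` is a hypothetical name.

```python
import numpy as np

def radial_psd(image):
    """Radially averaged power spectral density of a 2D grayscale array."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    # Shift the zero-frequency (DC) term to index (h // 2, w // 2).
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    yy, xx = np.indices((h, w))
    # Integer distance of each frequency bin from the DC term.
    radius = np.hypot(yy - h // 2, xx - w // 2).astype(int)
    total = np.bincount(radius.ravel(), weights=power.ravel())
    count = np.bincount(radius.ravel())
    return total / np.maximum(count, 1)

# A constant image has all of its power in the DC bin (radius 0).
psd = radial_psd(np.ones((8, 8)))
```

Comparing such curves for samples generated before and after unlearning is one way to see whether a method mainly alters low or high spatial frequencies.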

Weaknesses

- While the appendix mentions searching over discrete time ranges and frequency cutoffs $r_t$, this still requires manual grid search for each new scenario. The paper shows that CelebA-HQ works best with [250, 750] while Stable Diffusion requires [750, 1000], but provides no systematic approach for determining these ranges a priori beyond trial and error. Without principled guidelines or even heuristics, practitioners must conduct extensive grid search for each new task, limiting practical appli

Reviewer 02·Rating 4·Confidence 4

Strengths

- A good deal of quantitative and qualitative empirical evidence is provided to justify the claims that the middle range of timesteps is the most effective to target and that the awkward artifacts in samples produced by existing unlearning methods primarily result from high-frequency components being targeted.
- An analysis of preference optimization methods such as DPO and KTO is provided as well, and these methods appear to be quite effective.

Weaknesses

- The primary weakness is that the paper does not explain why the quality of the unlearned samples matters if they have already been successfully unlearned.
- Section 3 is missing some details (see questions).

Reviewer 03·Rating 4·Confidence 4

Strengths

1. The proposed time–frequency selective unlearning mechanism introduces a simple yet effective improvement over baseline approaches. This method focuses on mid-to-late timesteps and further constrains unlearning to low-frequency components through an FFT-based low-pass filter. This selective design allows the model to remove semantic information while preserving high-frequency texture details.
2. This paper introduces SSCDnorm to provide a more reliable measurement of unlearning effectiveness.
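The FFT-based low-pass filtering described in this review can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the circular mask shape, the `cutoff` parameter (measured in integer frequency indices), and the names `low_pass` and `low_freq_mse` are all hypothetical.

```python
import numpy as np

def low_pass(x, cutoff):
    """Keep spatial frequencies within `cutoff` of DC for a (..., H, W) array."""
    x = np.asarray(x, dtype=np.float64)
    h, w = x.shape[-2:]
    # Integer frequency indices, e.g. [0, 1, 2, 3, -4, -3, -2, -1] for size 8.
    fy = np.fft.fftfreq(h) * h
    fx = np.fft.fftfreq(w) * w
    radius = np.hypot(fy[:, None], fx[None, :])
    mask = radius <= cutoff  # circular mask (an assumption), broadcast over leading axes
    return np.fft.ifft2(np.fft.fft2(x) * mask).real

def low_freq_mse(pred, target, cutoff):
    """Mean squared error computed only on the low-pass part of the residual."""
    residual = np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    return float(np.mean(low_pass(residual, cutoff) ** 2))

# A checkerboard is pure high frequency, so it contributes nothing to the loss;
# a constant offset is pure DC, so it passes through unchanged.
checker = np.indices((8, 8)).sum(axis=0) % 2 * 2.0 - 1.0
high_only = low_freq_mse(checker, np.zeros((8, 8)), cutoff=2)
dc_only = low_freq_mse(np.ones((8, 8)), np.zeros((8, 8)), cutoff=2)
```

With a loss of this form, gradients carry no signal from frequencies above the cutoff, which matches the reviewer's description of removing semantic content while leaving fine texture untouched.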

Weaknesses

1. The time window and frequency cutoff are chosen empirically without adaptive tuning or robustness analysis. This makes the method less stable and difficult to generalize across different models and tasks. It is recommended that the authors conduct further experiments to explore how different parameter settings affect the unlearning performance and model stability.
2. The paper lacks systematic ablation studies, and the image-level experiment [l375] only removes six faces, which is too few to

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Cell Image Analysis Techniques