Provable unlearning in topic modeling and downstream tasks

Stanley Wei; Sadhika Malladi; Sanjeev Arora; Amartya Sanyal

arXiv:2411.12600·cs.LG·April 22, 2025

TL;DR

This paper introduces the first provable unlearning algorithms for topic models, enabling efficient removal of data with guarantees, applicable to downstream tasks like retrieval and classification.

Contribution

It provides a novel, computationally efficient unlearning method for topic models with formal guarantees, including after fine-tuning for downstream tasks.

Findings

01. The unlearning algorithm has a computational overhead that is independent of the dataset size.

02. The model's deletion capacity is quantified: the number of examples that can be unlearned without a significant cost in model performance.

03. Unlearning is easier after fine-tuning, and it can be done without modifying the base model.

Abstract

Machine unlearning algorithms are increasingly important as legal concerns arise around the provenance of training data, but verifying the success of unlearning is often difficult. Provable guarantees for unlearning are often limited to supervised learning settings. In this paper, we provide the first theoretical guarantees for unlearning in the pre-training and fine-tuning paradigm by studying topic models, simple bag-of-words language models that can be adapted to solve downstream tasks like retrieval and classification. First, we design a provably effective unlearning algorithm for topic models that incurs a computational overhead independent of the size of the original dataset. Our analysis additionally quantifies the deletion capacity of the model -- i.e., the number of examples that can be unlearned without incurring a significant cost in model performance. Finally, we formally…

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

The authors developed unlearning algorithms with provable guarantees for topic models and their downstream tasks. These can be viewed as the first theoretical guarantees for unlearning in the pretraining and finetuning paradigm. The proposed unlearning algorithms can also be implemented efficiently, with runtime independent of the dataset size.

Weaknesses

1. The proposed algorithms seem closely tied to the learning algorithm developed by Arora et al. (2012). Can the authors generalize their results to other learning algorithms?

2. Since this is the first unlearning algorithm in the pretraining and finetuning paradigm that enjoys provable guarantees, can the authors highlight their technical contributions in deriving the results? What are the main breakthroughs compared to developing provable unlearning algorithms in the supervised setting?

Reviewer 02 · Rating 8 · Confidence 3

Strengths

- Serious theoretical effort to give provable unlearning guarantees in a simplified setting. An important step towards understanding unlearning for more complicated pretrain/finetune setups.

- Builds on the provable topic modeling results of Arora et al. (2012) to give much stronger unlearning guarantees than other work on provable unlearning.

- The main results are clearly explained and contextualized, especially the difference between unlearning for pretrained models and unlearning for fine-

Weaknesses

- The presentation is a bit hard to follow and relies heavily on many results from Arora et al. (2012).

- $(\epsilon, \delta)$-unlearning is mentioned but not defined in L85--86. It would be helpful for those unfamiliar with the formalization of unlearning to at least sketch the details here so the reader can understand the type of guarantee that the paper is trying to give.

- The need for a utility-preserving unlearning definition is not well-motivated. In Definition 4, it seems intuitive th
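
For readers unfamiliar with the notion this review refers to, one common formalization of $(\epsilon, \delta)$-unlearning from the literature can be sketched as below. The symbols are assumptions introduced here for illustration ($A$ is the learning algorithm, $\bar{A}$ the unlearning algorithm, $S$ the training set, $S_f \subseteq S$ the forget set, $T$ any measurable set of models), and the paper's own definition may differ in its details.

```latex
% Sketch: unlearning S_f from A(S) should be (epsilon, delta)-indistinguishable
% from running the same procedure on a model trained without S_f.
\[
\Pr\big[\bar{A}(S_f, A(S)) \in T\big]
  \le e^{\epsilon}\,\Pr\big[\bar{A}(\emptyset, A(S \setminus S_f)) \in T\big] + \delta,
\]
\[
\Pr\big[\bar{A}(\emptyset, A(S \setminus S_f)) \in T\big]
  \le e^{\epsilon}\,\Pr\big[\bar{A}(S_f, A(S)) \in T\big] + \delta.
\]
```

Smaller $\epsilon$ and $\delta$ mean the unlearned model is harder to distinguish from one that never saw $S_f$.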

Reviewer 03 · Rating 6 · Confidence 2

Strengths

* The paper gives solid theoretical results.

Weaknesses

Unlearning algorithms for language models have practical applications, as the authors argue in the introduction of the paper. However, it is indeed a limitation, as the authors rightly point out, that the analysis is on topic models, which are not state of the art at this time. Further (I am saying this without citations), the Gaussian mechanism is also not practical, as it introduces too much noise to make the output useful.
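
To make the noise concern concrete, here is a minimal sketch of the classic Gaussian mechanism calibration from differential privacy, where the noise scale is sqrt(2 ln(1.25/delta)) times the L2 sensitivity divided by epsilon (valid for 0 < epsilon < 1). It is not the paper's algorithm: the function names, the toy parameter vector, and the sensitivity value are all hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_sigma(epsilon: float, delta: float, l2_sensitivity: float) -> float:
    """Noise scale of the classic Gaussian mechanism (assumes 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0):
        raise ValueError("classic calibration assumes 0 < epsilon < 1")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon


def add_gaussian_noise(values, sigma, seed=0):
    """Add independent N(0, sigma^2) noise to each coordinate."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Hypothetical toy parameters and sensitivity, for illustration only.
toy_parameters = [0.2, 0.5, 0.3]
sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, l2_sensitivity=0.01)
noisy_parameters = add_gaussian_noise(toy_parameters, sigma)
```

The noise scale grows linearly with the sensitivity and with 1/epsilon, so whether the perturbed output stays useful depends on how small the sensitivity of the released quantity is relative to its magnitude.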

Videos

Provable unlearning in topic modeling and downstream tasks · slideslive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Balanced Selection