Black-Box Forgetting
Yusuke Kuwana, Yuta Goto, Takashi Shibata, Go Irie

TL;DR
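This paper introduces a method that makes a large black-box pre-trained model selectively forget specified classes by optimizing only its input prompt, with no access to the model's architecture, parameters, or gradients. The authors motivate this by the accuracy loss and operational disadvantages of leaving unneeded classes recognizable.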
Contribution
It proposes a derivative-free prompt optimization method with latent context sharing, which enables selective forgetting when the model's architecture, parameters, and gradients are unavailable.
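The idea of derivative-free prompt optimization can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the stand-in model `make_toy_black_box`, the reading of latent context sharing as one shared latent block plus a small per-token block (`decode_context`), the loss in `forgetting_loss`, and the simple (1+1) evolution strategy in `random_search`, which stands in for whatever derivative-free optimizer the authors use. The point is only that the optimizer queries output probabilities and never touches weights or gradients.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def make_toy_black_box(num_classes, ctx_dim, rng):
    # Hypothetical stand-in for a black-box model: the caller sees only
    # output probabilities, never the weights or gradients.
    weights = [[rng.gauss(0, 1) for _ in range(ctx_dim)] for _ in range(num_classes)]
    def predict(context, true_label):
        logits = [sum(w * c for w, c in zip(weights[k], context)) for k in range(num_classes)]
        logits[true_label] += 4.0  # the toy model starts out accurate
        return softmax(logits)
    return predict

def decode_context(latent, num_tokens, shared_dim, unique_dim, proj):
    # Assumed form of latent context sharing: every token reuses one shared
    # latent block and adds its own small unique block.
    shared = latent[:shared_dim]
    context = []
    for t in range(num_tokens):
        start = shared_dim + t * unique_dim
        token_latent = shared + latent[start:start + unique_dim]
        # Fixed random projection from the low-dimensional latent to a token vector.
        context.extend(sum(a * z for a, z in zip(row, token_latent)) for row in proj)
    return context

def forgetting_loss(predict, context, forget, retain):
    # Assumed objective: low probability on forgotten classes,
    # high probability on retained ones.
    forget_term = sum(predict(context, y)[y] for y in forget) / len(forget)
    retain_term = -sum(math.log(max(predict(context, y)[y], 1e-12)) for y in retain) / len(retain)
    return forget_term + retain_term

def random_search(loss_fn, dim, steps, sigma, rng):
    # (1+1) evolution strategy: keep a candidate only if it lowers the loss.
    best = [0.0] * dim
    best_loss = loss_fn(best)
    history = [best_loss]
    for _ in range(steps):
        cand = [b + sigma * rng.gauss(0, 1) for b in best]
        cand_loss = loss_fn(cand)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
        history.append(best_loss)
    return best, history

rng = random.Random(0)
NUM_CLASSES, NUM_TOKENS, TOKEN_DIM = 5, 4, 8
SHARED_DIM, UNIQUE_DIM = 3, 2
proj = [[rng.gauss(0, 1) for _ in range(SHARED_DIM + UNIQUE_DIM)] for _ in range(TOKEN_DIM)]
predict = make_toy_black_box(NUM_CLASSES, NUM_TOKENS * TOKEN_DIM, rng)
forget, retain = [0, 1], [2, 3, 4]
latent_dim = SHARED_DIM + NUM_TOKENS * UNIQUE_DIM  # 11 values instead of 32

def loss_fn(latent):
    ctx = decode_context(latent, NUM_TOKENS, SHARED_DIM, UNIQUE_DIM, proj)
    return forgetting_loss(predict, ctx, forget, retain)

best_latent, history = random_search(loss_fn, latent_dim, steps=300, sigma=0.3, rng=rng)
```

In this sketch the search runs over 11 latent values rather than the 32 values of the full context, which hints at why sharing latent structure across tokens can help a derivative-free optimizer: such methods tend to scale poorly with the number of variables.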
Findings
Outperforms baseline methods on benchmark datasets
Effective in reducing recognition of specified classes
Maintains accuracy on other classes
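The two goals in these findings, low recognition of the forgotten classes and preserved accuracy on the rest, can be measured separately. The sketch below is a minimal illustration, not the paper's evaluation protocol; the function name `split_accuracy` and the choice of metrics (error rate on forgotten classes, accuracy on retained classes) are assumptions.

```python
def split_accuracy(predictions, labels, forget_classes):
    """Return (error on forgotten classes, accuracy on retained classes)."""
    forget_classes = set(forget_classes)
    forget_hits = forget_total = retain_hits = retain_total = 0
    for pred, label in zip(predictions, labels):
        if label in forget_classes:
            forget_total += 1
            forget_hits += pred == label
        else:
            retain_total += 1
            retain_hits += pred == label
    # Guard against an empty split so the function never divides by zero.
    forget_error = 1 - forget_hits / forget_total if forget_total else float("nan")
    retain_acc = retain_hits / retain_total if retain_total else float("nan")
    return forget_error, retain_acc

# Hypothetical predictions for six samples; class 0 is the class to forget.
labels = [0, 0, 1, 1, 2, 2]
predictions = [2, 1, 1, 1, 2, 0]
forget_error, retain_acc = split_accuracy(predictions, labels, forget_classes=[0])
```

A successful forgetting method should push the first number up while keeping the second close to the original model's accuracy.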
Abstract
Large-scale pre-trained models (PTMs) provide remarkable zero-shot classification capability covering a wide variety of object classes. However, practical applications do not always require the classification of all kinds of objects, and leaving the model capable of recognizing unnecessary classes not only degrades overall accuracy but also leads to operational disadvantages. To mitigate this issue, we explore the selective forgetting problem for PTMs, where the task is to make the model unable to recognize only the specified classes while maintaining accuracy for the rest. All the existing methods assume "white-box" settings, where model information such as architectures, parameters, and gradients is available for training. However, PTMs are often "black-box," where information on such models is unavailable for commercial reasons or social responsibilities. In this paper, we address a…
Taxonomy
Topics: Educator Training and Historical Pedagogy · Oral History, Memory, Narrative Analysis
