ICU-Bench: Benchmarking Continual Unlearning in Multimodal Large Language Models
Yuhang Wang, Wenjie Mei, Junkai Zhang, Guangyu He, Zhenxing Niu, Haichang Gao

TL;DR
ICU-Bench is a new benchmark for evaluating continual unlearning in multimodal large language models, focusing on privacy-sensitive data and highlighting current methods' limitations in real-world scenarios.
Contribution
The paper introduces ICU-Bench, a continual unlearning benchmark with new metrics, and reports extensive experiments that expose the limitations of existing methods.
Findings
Existing unlearning methods struggle in continual settings.
Current methods have limitations in balancing forgetting, utility, and scalability.
ICU-Bench enables detailed analysis of continual unlearning performance.
Abstract
Although Multimodal Large Language Models (MLLMs) have achieved remarkable progress across many domains, their training on large-scale multimodal datasets raises serious privacy concerns, making effective machine unlearning increasingly necessary. However, existing benchmarks mainly focus on static or short-sequence settings, offering limited support for evaluating continual privacy deletion requests in realistic deployments. To bridge this gap, we introduce ICU-Bench, a continual multimodal unlearning benchmark built on privacy-critical document data. ICU-Bench contains 1,000 privacy-sensitive profiles from two document domains, medical reports and labor contracts, with 9,500 images, 16,000 question-answer pairs, and 100 forget tasks. Additionally, new continual unlearning metrics are introduced, facilitating a comprehensive analysis of forgetting effectiveness, historical forgetting…
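To make the scale quoted in the abstract concrete, here is a minimal sketch that derives per-profile averages from the stated totals and lays out an ordered sequence of forget requests. The `ForgetTask` class and the `build_schedule` function are hypothetical names invented for illustration. The even split of profiles across tasks, and the idea that every profile eventually appears in a forget task, are assumptions; the abstract does not say how the 100 forget tasks are constructed.

```python
from __future__ import annotations

from dataclasses import dataclass

# Totals quoted in the abstract.
NUM_PROFILES = 1_000
NUM_IMAGES = 9_500
NUM_QA_PAIRS = 16_000
NUM_FORGET_TASKS = 100


@dataclass(frozen=True)
class ForgetTask:
    """Hypothetical container for one deletion request in the sequence."""
    index: int
    profile_ids: tuple[int, ...]


def build_schedule(num_profiles: int, num_tasks: int) -> list[ForgetTask]:
    """Split profile IDs into equally sized, ordered forget tasks.

    ASSUMPTION: an even split that covers every profile. The benchmark
    itself may partition profiles differently or keep a retain set.
    """
    if num_profiles % num_tasks != 0:
        raise ValueError("this sketch assumes an even split")
    size = num_profiles // num_tasks
    return [
        ForgetTask(i, tuple(range(i * size, (i + 1) * size)))
        for i in range(num_tasks)
    ]


schedule = build_schedule(NUM_PROFILES, NUM_FORGET_TASKS)

# Simple averages derived from the quoted totals.
images_per_profile = NUM_IMAGES / NUM_PROFILES
qa_per_profile = NUM_QA_PAIRS / NUM_PROFILES
```

Under these assumptions, a continual evaluation would process `schedule` in order, applying one forget task at a time and re-measuring the model after each step, rather than deleting everything in a single batch.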
