Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models

JuneHyoung Kwon; JungMin Yun; YoungBin Kim

arXiv:2605.03547 · cs.CV · May 6, 2026

TL;DR

This paper introduces CoVUBench, a benchmark for evaluating how well large vision-language models can unlearn copyrighted visual content, addressing a critical challenge in responsible AI deployment.

Contribution

It presents the first systematic framework and evaluation protocol for measuring copyright content unlearning effectiveness in multimodal large models.

Findings

01. CoVUBench enables robust assessment of unlearning in LVLMs.
02. The benchmark balances copyright content removal with model utility preservation.
03. Procedurally generated synthetic data ensures realistic evaluation scenarios.

Abstract

Large Vision-Language Models (LVLMs), trained on web-scale data, risk memorizing and regenerating copyrighted visual content such as characters and logos, creating significant challenges. Machine unlearning offers a path to mitigate these risks by removing specific content post-training, but evaluating its effectiveness, especially in the complex multimodal setting of LVLMs, remains an open problem. Current evaluation methods often lack robustness or fail to capture the nuances of cross-modal concept erasure. To address this critical gap, we introduce the CoVUBench benchmark, the first framework specifically designed for evaluating copyright content unlearning in LVLMs. CoVUBench utilizes procedurally generated, legally safe synthetic data coupled with systematic visual variations spanning compositional changes and diverse domain manifestations to ensure realistic and robust evaluation…

Code & Models

Datasets

herbwood27/CoVUBench
dataset · 109 dl
