Bridging the Copyright Gap: Do Large Vision-Language Models Recognize and Respect Copyrighted Content?

Naen Xu; Jinghuai Zhang; Changjiang Li; Hengyu An; Chunyi Zhou; Jun Wang; Boyu Xu; Yuyuan Li; Tianyu Du; Shouling Ji

arXiv:2512.21871 · cs.CL · December 29, 2025

TL;DR

This paper evaluates how well large vision-language models recognize and respect copyrighted content, revealing significant deficiencies and proposing a tool-augmented framework to improve copyright compliance.

Contribution

It introduces a large-scale benchmark dataset and a novel defense framework to assess and enhance copyright awareness in LVLMs.

Findings

01 State-of-the-art LVLMs often fail to recognize copyrighted content.

02 The benchmark dataset includes 50,000 multimodal pairs with and without copyright notices.

03 The proposed framework reduces copyright infringement risks.

Abstract

Large vision-language models (LVLMs) have achieved remarkable advancements in multimodal reasoning tasks. However, their widespread accessibility raises critical concerns about potential copyright infringement. Will LVLMs accurately recognize and comply with copyright regulations when encountering copyrighted content (i.e., user input, retrieved documents) in the context? Failure to comply with copyright regulations may lead to serious legal and ethical consequences, particularly when LVLMs generate responses based on copyrighted materials (e.g., retrieved book excerpts, news reports). In this paper, we present a comprehensive evaluation of various LVLMs, examining how they handle copyrighted content, such as book excerpts, news articles, music lyrics, and code documentation, when it is presented as visual input. To systematically measure copyright compliance, we introduce a…

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Language and cultural evolution