Copyright Detective: A Forensic System to Evidence LLMs Flickering Copyright Leakage Risks
Guangwei Zhang, Jianing Zhu, Cheng Qian, Neil Gong, Rada Mihalcea, Zhaozhuo Xu, Jingrui He, Jiaqi Ma, Yun Huang, Chaowei Xiao, Bo Li, Ahmed Abbasi, Dongwon Lee, Heng Ji, Denghui Zhang

TL;DR
Copyright Detective is an interactive forensic system designed to detect, analyze, and visualize copyright risks in LLM outputs by integrating multiple detection methods within a unified framework for responsible AI deployment.
Contribution
The paper introduces what the authors describe as the first comprehensive forensic system for copyright risk assessment in LLMs, combining several detection paradigms with an interactive workflow for systematic auditing.
Findings
Supports responsible deployment of LLMs by identifying copyright risks.
Enables transparent evaluation of copyright leakage in black-box models.
Provides a unified framework for multiple detection techniques.
Abstract
We present Copyright Detective, the first interactive forensic system for detecting, analyzing, and visualizing potential copyright risks in LLM outputs. Because copyright law is complex, the system treats the question of infringement versus compliance as an evidence discovery process rather than a static classification task. It integrates multiple detection paradigms, including content recall testing, paraphrase-level similarity analysis, persuasive jailbreak probing, and unlearning verification, within a unified and extensible framework. Through interactive prompting, response collection, and iterative workflows, our system enables systematic auditing of verbatim memorization and paraphrase-level leakage, supporting responsible deployment and transparent evaluation of LLM copyright risks even with black-box access.
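To make the idea of content recall testing concrete, here is a minimal sketch of how a collected model response might be scored against a reference passage for verbatim overlap. The abstract does not state which metrics the system uses, so the two measures below (longest shared word run and n-gram overlap) and the function names `longest_common_run` and `ngram_overlap` are illustrative assumptions, not the authors' implementation.

```python
from difflib import SequenceMatcher


def longest_common_run(reference: str, response: str) -> int:
    """Length, in words, of the longest contiguous word sequence shared by both texts.

    Hypothetical helper: one possible signal of verbatim memorization.
    """
    ref_words = reference.lower().split()
    resp_words = response.lower().split()
    # autojunk=False keeps frequent words from being discarded on long inputs.
    matcher = SequenceMatcher(None, ref_words, resp_words, autojunk=False)
    match = matcher.find_longest_match(0, len(ref_words), 0, len(resp_words))
    return match.size


def ngram_overlap(reference: str, response: str, n: int = 3) -> float:
    """Fraction of the response's word n-grams that also occur in the reference.

    Hypothetical helper: a coarse overlap score, not a legal determination.
    """
    def ngrams(text: str) -> list:
        words = text.lower().split()
        return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

    reference_grams = set(ngrams(reference))
    response_grams = ngrams(response)
    if not response_grams:
        return 0.0
    hits = sum(gram in reference_grams for gram in response_grams)
    return hits / len(response_grams)
```

In an evidence-discovery workflow, scores like these would be gathered across many prompts and responses and treated as indicators to review rather than as a verdict. Paraphrase-level leakage would need a semantic similarity measure instead, which this sketch does not cover.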
Taxonomy
Topics: Handwritten Text Recognition Techniques · Authorship Attribution and Profiling · Digital and Cyber Forensics
