ISACL: Internal State Analyzer for Copyrighted Training Data Leakage

Guangwei Zhang; Qisheng Su; Jiateng Liu; Cheng Qian; Yanzhou Pan; Yanjie Fu; Denghui Zhang

arXiv:2508.17767 · cs.CL · September 16, 2025

TL;DR

This paper proposes a proactive method to detect potential copyrighted data leaks in Large Language Models by analyzing their internal states before text generation, improving privacy and compliance.

Contribution

The paper introduces a neural network classifier that examines an LLM's internal states to identify data-leakage risk before any output is generated, a preventative approach the authors present as novel.
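To make the idea of classifying internal states concrete, here is a minimal sketch, not the authors' implementation. It assumes each prompt has already been reduced to a fixed-length hidden-state vector, and it swaps the paper's neural network classifier for a single-layer logistic probe to stay dependency-free. The vectors are synthetic stand-ins, and the names `train_probe`, `leak_risk`, and `synthetic_states` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_probe(states, labels, epochs=200, lr=0.5):
    """Fit a logistic probe with full-batch gradient descent.

    states: list of hidden-state vectors (assumed already extracted from an LLM)
    labels: 1 if the prompt led to leaked copyrighted text, else 0
    """
    dim = len(states[0])
    n = len(states)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(states, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss w.r.t. the logit
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def leak_risk(state, w, b):
    # Probability-like score that this internal state precedes a leak.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, state)) + b)


def synthetic_states(n, dim, shift, rng):
    # Synthetic stand-in for real hidden states: Gaussian noise around `shift`.
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
DIM = 8
risky = synthetic_states(50, DIM, 1.0, rng)   # pretend: states before a leak
safe = synthetic_states(50, DIM, -1.0, rng)   # pretend: states before benign output
w, b = train_probe(risky + safe, [1] * 50 + [0] * 50)
```

In a real setting the vectors would come from a chosen layer of the model and the labels from a curated dataset of copyrighted material. Which layer, which pooling, and which classifier architecture to use are all details this sketch does not take from the paper.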

Findings

01 Internal state analysis effectively detects potential data leaks.

02 The method reduces copyright infringement risks during text generation.

03 Scalable solution integrated with RAG systems enhances data privacy.

Abstract

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) but pose risks of inadvertently exposing copyrighted or proprietary data, especially when such data is used for training but not intended for distribution. Traditional methods address these leaks only after content is generated, which can lead to the exposure of sensitive information. This study introduces a proactive approach: examining LLMs' internal states before text generation to detect potential leaks. By using a curated dataset of copyrighted materials, we trained a neural network classifier to identify risks, allowing for early intervention by stopping the generation process or altering outputs to prevent disclosure. Integrated with a Retrieval-Augmented Generation (RAG) system, this framework ensures adherence to copyright and licensing requirements while enhancing data privacy and ethical…
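The early-intervention flow described in the abstract (score the internal state first, then either stop or proceed) can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: `guarded_generate`, `REFUSAL`, the 0.5 threshold, and the three toy callables are all hypothetical, and only the "stop generation" branch is shown, not output alteration or the RAG integration.

```python
from typing import Callable, List, Tuple

# Hypothetical placeholder returned when generation is blocked.
REFUSAL = "[generation stopped: possible copyrighted-content leak]"


def guarded_generate(
    prompt: str,
    encode: Callable[[str], List[float]],
    risk_score: Callable[[List[float]], float],
    generate: Callable[[str], str],
    threshold: float = 0.5,  # assumed cut-off, not a value from the paper
) -> Tuple[str, float]:
    """Check the model's internal state before producing any text."""
    state = encode(prompt)      # internal state for the prompt
    risk = risk_score(state)    # classifier's leakage-risk estimate
    if risk >= threshold:
        return REFUSAL, risk    # intervene before any token is emitted
    return generate(prompt), risk


# Toy stand-ins so the sketch runs without a real model.
def toy_encode(prompt: str) -> List[float]:
    return [1.0 if "verbatim" in prompt.lower() else 0.0]


def toy_risk(state: List[float]) -> float:
    return 0.9 if state[0] > 0.5 else 0.1


def toy_generate(prompt: str) -> str:
    return "echo: " + prompt


blocked, blocked_risk = guarded_generate(
    "Quote the novel verbatim", toy_encode, toy_risk, toy_generate
)
allowed, allowed_risk = guarded_generate(
    "Summarize the plot", toy_encode, toy_risk, toy_generate
)
```

Passing the encoder, scorer, and generator as callables keeps the gate independent of any particular model or classifier, so a trained probe could replace `toy_risk` without changing the control flow.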
