Intrinsic Fingerprint of LLMs: Continue Training is NOT All You Need to Steal A Model!
Do-hyeon Yoon, Minsoo Chun, Thomas Allen, Hans Müller, Min Wang, and Rajesh Sharma

TL;DR
This paper proposes a robust intrinsic fingerprinting method for LLMs based on stable attention parameter distributions, effectively identifying model lineage and detecting copyright violations even after continued training.
Contribution
It introduces a fingerprinting approach based on attention parameter distributions that remain stable after continued training, aiding model attribution and copyright protection.
Findings
Attention parameter distribution patterns are distinctive and stable after extensive training.
The method successfully identifies model lineage across different model families.
Evidence of model plagiarism and copyright violation was uncovered in a case study.
Abstract
Large language models (LLMs) face significant copyright and intellectual property challenges as the cost of training increases and model reuse becomes prevalent. While watermarking techniques have been proposed to protect model ownership, they may not be robust to continued training and development, posing serious threats to model attribution and copyright protection. This work introduces a simple yet effective approach for robust LLM fingerprinting based on intrinsic model characteristics. We discover that the standard deviation distributions of attention parameter matrices across different layers exhibit distinctive patterns that remain stable even after extensive continued training. These parameter distribution signatures serve as robust fingerprints that can reliably identify model lineage and detect potential copyright infringement. Our experimental validation across multiple model…
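The idea in the abstract, a per-layer standard deviation profile of the attention matrices used as a fingerprint, can be illustrated with a small sketch. This is not the authors' implementation: the helper names (`std_fingerprint`, `similarity`, `make_model`), the projection labels `q`/`k`/`v`/`o`, the use of Pearson correlation as the comparison metric, and the synthetic weights standing in for real checkpoints are all assumptions made for illustration.

```python
import numpy as np

PROJECTIONS = ("q", "k", "v", "o")  # assumed labels for the attention matrices


def std_fingerprint(layers):
    """Return {projection: array of per-layer standard deviations}.

    `layers` is a list of dicts mapping a projection name to a 2-D weight array.
    """
    return {
        name: np.array([float(np.std(layer[name])) for layer in layers])
        for name in PROJECTIONS
    }


def similarity(fp_a, fp_b):
    """Mean Pearson correlation between two fingerprints (an assumed metric)."""
    scores = [np.corrcoef(fp_a[n], fp_b[n])[0, 1] for n in PROJECTIONS]
    return float(np.mean(scores))


def make_model(rng, n_layers=8, dim=32):
    """Build a toy model whose per-layer weight scales are drawn at random."""
    return [
        {n: rng.normal(0.0, rng.uniform(0.01, 0.05), size=(dim, dim))
         for n in PROJECTIONS}
        for _ in range(n_layers)
    ]


rng = np.random.default_rng(0)
base = make_model(rng)

# Simulate continued training as a small perturbation of every weight matrix.
derived = [
    {n: w + rng.normal(0.0, 0.002, size=w.shape) for n, w in layer.items()}
    for layer in base
]

# An independently initialised model with the same shape.
unrelated = make_model(rng)

fp_base = std_fingerprint(base)
sim_derived = similarity(fp_base, std_fingerprint(derived))
sim_unrelated = similarity(fp_base, std_fingerprint(unrelated))

print(f"base vs derived:   {sim_derived:.3f}")
print(f"base vs unrelated: {sim_unrelated:.3f}")
```

In this toy setting the perturbed copy keeps a per-layer profile that correlates strongly with the base model, while the independently drawn model does not. Whether real checkpoints behave this way, and which comparison metric is appropriate, is what the paper's experiments address.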
