A Fingerprint for Large Language Models
Zhiguang Yang, Hanzhou Wu

TL;DR
This paper introduces a black-box fingerprinting method for large language models that can verify model ownership and detect modifications, offering an efficient and robust solution for protecting LLMs against infringement.
Contribution
The authors propose a novel black-box fingerprinting technique that characterizes each LLM by the unique vector space its outputs span, enabling both infringement detection and identification of fine-tuning modifications.
Findings
High accuracy in fingerprint verification
Robustness against parameter-efficient fine-tuning attacks
Efficient detection of model infringement
Abstract
Recent advances confirm that large language models (LLMs) can achieve state-of-the-art performance across various tasks. However, due to the resource-intensive nature of training LLMs from scratch, it is urgent and crucial to protect the intellectual property of LLMs against infringement. This has motivated the authors in this paper to propose a novel black-box fingerprinting technique for LLMs. We first demonstrate that the outputs of LLMs span a unique vector space associated with each model. We model the problem of fingerprint authentication as the task of evaluating the similarity between the space of the victim model and the space of the suspect model. To tackle this problem, we introduce two solutions: the first determines whether suspect outputs lie within the victim's subspace, enabling fast infringement detection; the second reconstructs a joint subspace to detect models…
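The first solution described in the abstract, checking whether a suspect output lies within the victim's subspace, can be illustrated with a generic linear-algebra sketch. This is not the authors' implementation, which the excerpt above does not specify. It assumes output vectors are available as NumPy arrays, uses synthetic data in place of real model outputs, and the names `subspace_basis`, `residual_ratio`, and `victim_map` are hypothetical.

```python
import numpy as np


def subspace_basis(outputs, tol=1e-8):
    """Return an orthonormal basis (rank x dim) for the row space of `outputs`."""
    # Right singular vectors with non-negligible singular values span the row space.
    _, s, vt = np.linalg.svd(outputs, full_matrices=False)
    rank = int(np.sum(s > tol * s.max()))
    return vt[:rank]


def residual_ratio(basis, vector):
    """Fraction of the vector's norm that lies outside the subspace."""
    projection = basis.T @ (basis @ vector)
    return float(np.linalg.norm(vector - projection) / np.linalg.norm(vector))


# Synthetic stand-in (assumption): victim outputs live in an 8-dim subspace of R^64.
rng = np.random.default_rng(0)
dim, rank = 64, 8
victim_map = rng.standard_normal((rank, dim))
victim_outputs = rng.standard_normal((100, rank)) @ victim_map
basis = subspace_basis(victim_outputs)

# A vector generated through the same map should project almost perfectly;
# an unrelated random vector should leave a large residual.
inside = rng.standard_normal(rank) @ victim_map
outside = rng.standard_normal(dim)
print("inside residual:", residual_ratio(basis, inside))
print("outside residual:", residual_ratio(basis, outside))
```

A near-zero residual suggests the suspect vector is consistent with the victim's subspace, while a large residual suggests it is not. Any decision threshold would have to be calibrated on real model outputs.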
