Instructional Fingerprinting of Large Language Models
Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao,, Muhao Chen

TL;DR
This paper introduces a lightweight fingerprinting method for large language models using instruction backdoors, enabling ownership verification without impacting model performance and resisting unauthorized claims.
Contribution
It proposes a novel, lightweight instruction tuning approach for LLM fingerprinting that maintains model performance and enhances security features.
Findings
Effective fingerprinting on 11 popular LLMs
No impact on normal model behavior
Resistant to fingerprint guessing and parameter-efficient training
Abstract
The exorbitant cost of training Large language models (LLMs) from scratch makes it essential to fingerprint the models to protect intellectual property via ownership authentication and to ensure downstream users and developers comply with their license terms (e.g. restricting commercial use). In this study, we present a pilot study on LLM fingerprinting as a form of very lightweight instruction tuning. Model publisher specifies a confidential private key and implants it as an instruction backdoor that causes the LLM to generate specific text when the key is present. Results on 11 popularly-used LLMs showed that this approach is lightweight and does not affect the normal behavior of the model. It also prevents publisher overclaim, maintains robustness against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to MIT License. Code is…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Software Engineering Research
