Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique

Mark Russinovich; Ahmed Salem

arXiv:2407.10887 · cs.CR · June 13, 2025 · 1 cites

Open Access · 3 Reviews

TL;DR

This paper introduces Chain & Hash, a cryptographic fingerprinting method for LLMs that ensures ownership proof and robustness against modifications, addressing security concerns in model misuse and theft.

Contribution

The paper presents a novel Chain and Hash framework that cryptographically links prompts and responses, and enhances fingerprint robustness against output alterations and adversarial attacks.

Findings

1. Provides strong security for ownership proof
2. Resilient against fine-tuning and adversarial erasure
3. Applicable to fingerprinting LoRA adapters

Abstract

Growing concerns over the theft and misuse of Large Language Models (LLMs) have heightened the need for effective fingerprinting, which links a model to its original version to detect misuse. In this paper, we define five key properties for a successful fingerprint: Transparency, Efficiency, Persistence, Robustness, and Unforgeability. We introduce a novel fingerprinting framework that provides verifiable proof of ownership while maintaining fingerprint integrity. Our approach makes two main contributions. First, we propose a Chain and Hash technique that cryptographically binds fingerprint prompts with their responses, ensuring no adversary can generate colliding fingerprints and allowing model owners to irrefutably demonstrate their creation. Second, we address a realistic threat model in which instruction-tuned models' output distribution can be significantly altered through…
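The binding idea in the abstract can be illustrated with a minimal sketch. The reviews summarized on this page state that SHA-256 is used to bind fingerprint prompts to 256 predefined responses; everything else here is an assumption made for illustration, including the function names `assign_responses` and `verify`, the length-prefixed serialization of the chain, and the choice of the last digest byte as the response index. It is not the authors' implementation.

```python
import hashlib


def assign_responses(questions, responses):
    """Derive each question's target response from a SHA-256 digest (illustrative only)."""
    if len(responses) != 256:
        raise ValueError("expected exactly 256 candidate responses")

    # Assumption: the chain commits to every question and every candidate
    # response, so altering any element changes the commitment.
    chain = hashlib.sha256()
    for item in list(questions) + list(responses):
        data = item.encode("utf-8")
        chain.update(len(data).to_bytes(4, "big"))  # length prefix keeps items unambiguous
        chain.update(data)
    commitment = chain.digest()

    assignment = {}
    for question in questions:
        digest = hashlib.sha256(commitment + question.encode("utf-8")).digest()
        # Assumption: one digest byte (0..255) selects one of the 256 responses.
        assignment[question] = responses[digest[-1]]
    return assignment


def verify(questions, responses, observed):
    """Count how many observed model answers match the hash-derived targets."""
    expected = assign_responses(questions, responses)
    return sum(observed.get(q) == target for q, target in expected.items())
```

Because the target response follows from the hash rather than from the owner's free choice, a claimant cannot pick question and answer pairs after the fact without finding inputs whose digests happen to match, which is the intuition behind the unforgeability property.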

Peer Reviews

Decision: ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 4

Strengths

1. The framework introduced to guide the design of fingerprints is very thorough and carefully crafted. All of the listed objectives are important for a practical and useful fingerprinting scheme.
2. A key contribution is the usage of cryptographic tools (hashing) to ensure unforgeability: a computationally bounded attacker cannot claim ownership of a model if they cannot influence its responses (e.g. by having injected them).
3. The fingerprint insertion is also carefully designed to satisfy t

Weaknesses

1. One weakness is that, in order to prove ownership of a model, the owner must reveal the matching chain. Once the chain has been revealed, it can no longer be relied on in the future: the questions and answers are known, so anyone trying to avoid being fingerprinted can easily evade it. This necessitates multiple chains. However, the impact of multiple chains on transparency is not explored, and multiple chains are only discussed briefly in the collusion section.

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. Proposes the "Chain & Hash" cryptographic technique, which uses SHA-256 to bind fingerprint prompts to 256 predefined responses.
2. Supports black-box verification that requires only API access, which aligns with real-world scenarios.
3. Extends IP protection to LoRA adapters by embedding fingerprints directly into these parameter-efficient fine-tuning modules.

Weaknesses

1. The benchmarks used in the paper to evaluate model utility were proposed between 2019 and 2022, and the paper fails to verify the framework's performance on new benchmarks released in the past two years.
2. The paper relies on GPT-4 to generate diverse meta-prompts for enhancing fingerprint persistence. However, key implementation details are not mentioned in either the main text or the appendix.
3. The paper only compares its method with the black-box technique proposed by Xu et al. (202

Reviewer 03 · Rating 4 · Confidence 5

Strengths

1. I like the evaluation of robustness to prompt changes, which is a direction often overlooked in fingerprinting research.
2. The threat model of false claims of ownership is novel and seems realistic.

Weaknesses

Evaluations: I believe that the evals in the paper are incomplete.
E.1. All the evals reported in the paper are on classification tasks, and there are no generative evals. Training models on incoherent text could lead to incoherent model generations, which should be tested through evals like IFEval or GSM8K.
E.2. Baseline comparisons are completely absent. There is a claim that Xu et al. does not produce harmless fingerprints, but this is not substantiated fully (line 99). Similarly, it

Taxonomy

Topics: Digital Rights Management and Security

Methods: Sparse Evolutionary Training