Ideal Attribution and Faithful Watermarks for Language Models

Min Jae Song; Kameron Shahabi

arXiv:2512.07038·cs.CR·December 9, 2025



TL;DR

This paper proposes a formal framework for ideal attribution and watermarking in language models, providing a clear foundation for designing and evaluating attribution mechanisms with guaranteed properties.

Contribution

It introduces a formal abstraction called the ledger for deterministic attribution decisions and frames watermarking as a faithful representation of these ideal mechanisms.

Findings

01 Provides a unified language for attribution guarantees

02 Enables precise reasoning about watermarking desiderata

03 Sets a roadmap for future watermarking scheme development

Abstract

We introduce ideal attribution mechanisms, a formal abstraction for reasoning about attribution decisions over strings. At the core of this abstraction lies the ledger, an append-only log of the prompt-response interaction history between a model and its user. Each mechanism produces deterministic decisions based on the ledger and an explicit selection criterion, making it well-suited to serve as a ground truth for attribution. We frame the design goal of watermarking schemes as faithful representation of ideal attribution mechanisms. This novel perspective brings conceptual clarity, replacing piecemeal probabilistic statements with a unified language for stating the guarantees of each scheme. It also enables precise reasoning about desiderata for future watermarking schemes, even when no current construction achieves them, since the ideal functionalities are specified first. In this…
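As a rough illustration of the ledger idea described above, the following minimal sketch models an append-only log of prompt-response pairs and a deterministic attribution decision parameterized by a selection criterion. Every name here (`Entry`, `Ledger`, `attribute`, `exact_match`, `contains_response`) is hypothetical, and the two criteria are stand-ins chosen for illustration; the abstract does not state the paper's actual definitions, so this should not be read as the authors' construction.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Entry:
    """One prompt-response interaction (hypothetical record type)."""
    prompt: str
    response: str


class Ledger:
    """Append-only log: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def append(self, prompt: str, response: str) -> None:
        self._entries.append(Entry(prompt, response))

    def entries(self) -> Tuple[Entry, ...]:
        # Return an immutable snapshot so callers cannot mutate the log.
        return tuple(self._entries)


# A selection criterion decides whether a candidate string is "covered"
# by a given ledger entry. Both criteria below are illustrative assumptions.
Criterion = Callable[[str, Entry], bool]


def exact_match(candidate: str, entry: Entry) -> bool:
    return candidate == entry.response


def contains_response(candidate: str, entry: Entry) -> bool:
    return entry.response in candidate


def attribute(ledger: Ledger, candidate: str, criterion: Criterion) -> bool:
    """Deterministic decision: True iff some ledger entry satisfies the criterion."""
    return any(criterion(candidate, e) for e in ledger.entries())


ledger = Ledger()
ledger.append("Write a haiku", "old pond frog leaps in")

print(attribute(ledger, "old pond frog leaps in", exact_match))
print(attribute(ledger, "He wrote: old pond frog leaps in!", contains_response))
print(attribute(ledger, "an unrelated sentence", contains_response))
```

Because the decision depends only on the ledger contents and the chosen criterion, repeated queries always agree, which is the property that lets such a mechanism act as a ground truth against which a watermarking scheme could be compared.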


Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques