FedAttr: Towards Privacy-preserving Client-Level Attribution in Federated LLM Fine-tuning

Su Zhang; Junfeng Guo; Heng Huang

arXiv:2605.06596 · cs.CR · May 8, 2026

TL;DR

FedAttr is a protocol for client-level attribution in federated learning that identifies which clients trained on watermarked data while preserving client privacy and maintaining FL performance.

Contribution

The paper introduces FedAttr, a privacy-preserving attribution method that detects which clients trained on watermarked data during federated LLM fine-tuning.

Findings

01. Achieves a 100% true positive rate (TPR) and a 0% false positive rate (FPR) in experiments.

02. Outperforms baseline methods by at least 44.4% in TPR or 19.1% in FPR.

03. Adds only 6.3% overhead to federated training time.
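
For readers unfamiliar with the two metrics, the sketch below shows how TPR and FPR are conventionally computed for client-level attribution. The function name `tpr_fpr`, the client IDs, and the flagged sets are hypothetical illustrations; none of the numbers come from the paper's experiments.

```python
def tpr_fpr(flagged, watermarked, all_clients):
    """Return (TPR, FPR) for a set of flagged clients against ground truth."""
    flagged, watermarked = set(flagged), set(watermarked)
    clean = set(all_clients) - watermarked
    tp = len(flagged & watermarked)  # watermarked clients correctly flagged
    fp = len(flagged & clean)        # clean clients wrongly flagged
    tpr = tp / len(watermarked) if watermarked else 0.0
    fpr = fp / len(clean) if clean else 0.0
    return tpr, fpr


# Hypothetical setup: 10 clients, of which clients 2 and 7 trained on watermarked data.
clients = range(10)
truth = {2, 7}

perfect = tpr_fpr({2, 7}, truth, clients)  # every watermarked client flagged, no clean one
noisy = tpr_fpr({2, 3}, truth, clients)    # one watermarked client missed, one clean client flagged
```

Under these definitions, a 100% TPR with a 0% FPR means that the flagged set matches the set of watermarked clients exactly.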

Abstract

Watermark radioactivity tests can detect whether a model was trained on watermarked documents and have become key tools for protecting data ownership in the fine-tuning of large language models (LLMs). Existing work has demonstrated their effectiveness in centralized LLM fine-tuning. However, these methods face several challenges and remain underexplored in federated learning (FL), a widely used paradigm for collaboratively fine-tuning LLMs on private data held by different users. FL mainly ensures privacy through secure aggregation (SA), which lets the server aggregate updates while keeping each client's update private. This mechanism preserves privacy but makes it difficult to identify which client trained on watermarked documents. In this work, we propose FedAttr, a new client-level attribution protocol for FL. FedAttr identifies which clients trained on…
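
To make the attribution difficulty concrete, here is a minimal sketch of secure aggregation using pairwise additive masking, one common construction. It is an assumption for illustration only and is not FedAttr's protocol or necessarily the SA scheme the paper uses. The names `masked_updates` and `aggregate`, the modulus `MOD`, and the integer toy updates are all hypothetical; real schemes derive masks from pairwise key agreement, quantize floating-point updates, and handle client dropout.

```python
import random

MOD = 2**32  # all arithmetic is done modulo this value


def masked_updates(updates, seed=0):
    """Add pairwise masks that cancel in the sum (toy secure aggregation)."""
    rng = random.Random(seed)  # stand-in for secrets shared between client pairs
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.randrange(MOD) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] = (masked[i][k] + mask[k]) % MOD  # client i adds the mask
                masked[j][k] = (masked[j][k] - mask[k]) % MOD  # client j subtracts it
    return masked


def aggregate(vectors):
    """Server-side element-wise sum modulo MOD."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) % MOD for k in range(dim)]


# Hypothetical integer-encoded updates from three clients.
updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
masked = masked_updates(updates)
total = aggregate(masked)  # equals the element-wise sum of the raw updates
```

In this sketch the server sees only the masked vectors, which look random individually, yet their sum equals the sum of the raw updates. That is why a server observing only the aggregate cannot directly tell which client's update carries a watermark signal.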
