TimberStrike: Dataset Reconstruction Attack Revealing Privacy Leakage in Federated Tree-Based Systems

Marco Di Gennaro; Giovanni De Lucia; Stefano Longari; Stefano Zanero; Michele Carminati

arXiv:2506.07605·cs.CR·July 15, 2025

TL;DR

TimberStrike is an optimization-based attack that reveals sensitive training data in federated tree-based models, exposing privacy vulnerabilities across multiple frameworks and datasets.

Contribution

This paper introduces TimberStrike, the first dataset reconstruction attack targeting federated tree-based models, demonstrating significant privacy risks and analyzing mitigation strategies.

Findings

01 Reconstructs 73-95% of the training data in experiments

02 Multiple federated learning frameworks are vulnerable (Flower, NVFlare, FedTree)

03 Differential Privacy partially mitigates the data leakage

Abstract

Federated Learning has emerged as a privacy-oriented alternative to centralized Machine Learning, enabling collaborative model training without direct data sharing. While extensively studied for neural networks, the security and privacy implications of tree-based models remain underexplored. This work introduces TimberStrike, an optimization-based dataset reconstruction attack targeting horizontally federated tree-based models. Our attack, carried out by a single client, exploits the discrete nature of decision trees by using split values and decision paths to infer sensitive training data from other clients. We evaluate TimberStrike on state-of-the-art federated gradient boosting implementations across multiple frameworks, including Flower, NVFlare, and FedTree, demonstrating their vulnerability to privacy breaches. On a publicly available stroke prediction dataset, TimberStrike…
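To make the leakage channel in the abstract concrete, the sketch below shows how the split values along a single decision path already bound the feature values of any sample that reached that leaf. This is only an illustration of the general idea, not the paper's optimization-based attack. The function name `path_to_intervals`, the tuple layout of a path, the example feature indices and thresholds, and the convention that a sample goes left when `x[feature] < threshold` are all assumptions made for this example.

```python
from math import inf


def path_to_intervals(path, n_features):
    """Intersect the split conditions along one root-to-leaf path.

    path: list of (feature_index, threshold, goes_left) tuples.
          goes_left=True is assumed to mean x[feature] < threshold.
    Returns one (low, high) pair per feature, meaning low <= x < high.
    """
    bounds = [(-inf, inf)] * n_features
    for feature, threshold, goes_left in path:
        low, high = bounds[feature]
        if goes_left:
            # Sample satisfied x < threshold: tighten the upper bound.
            high = min(high, threshold)
        else:
            # Sample satisfied x >= threshold: tighten the lower bound.
            low = max(low, threshold)
        bounds[feature] = (low, high)
    return bounds


# Hypothetical path over two features (0 and 1):
# feature 0 >= 50, feature 1 < 140, feature 0 < 65
example_path = [(0, 50.0, False), (1, 140.0, True), (0, 65.0, True)]
intervals = path_to_intervals(example_path, n_features=2)
print(intervals)  # [(50.0, 65.0), (-inf, 140.0)]
```

Each additional tree in a boosted ensemble contributes further paths, so the intervals for a given sample can only shrink as more trees are observed. How the attack turns such constraints into reconstructed records is described in the paper itself.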

Code & Models

Repositories

necst/timberstrike (Official)
