Eliminating Label Leakage in Tree-Based Vertical Federated Learning

Hideaki Takahashi; Jingjing Liu; Yang Liu

arXiv:2307.10318 · cs.LG · October 24, 2023 · 1 citation

Open Access

TL;DR

This paper identifies a new label inference attack in tree-based vertical federated learning and proposes two defense mechanisms to effectively prevent label leakage, enhancing privacy without sacrificing model utility.

Contribution

It introduces ID2Graph, a novel label inference attack on tree-based VFL, and proposes two defenses, Grafting-LDP and ID-LMID, to mitigate this vulnerability.

Findings

01

ID2Graph effectively infers labels in tree-based VFL.

02

Proposed defenses significantly reduce label leakage.

03

Defense methods maintain high model utility.

Abstract

Vertical federated learning (VFL) enables multiple parties with disjoint features of a common user set to train a machine learning model without sharing their private data. Tree-based models have become prevalent in VFL due to their interpretability and efficiency. However, the vulnerability of tree-based VFL has not been sufficiently investigated. In this study, we first introduce a novel label inference attack, ID2Graph, which utilizes the sets of record IDs assigned to each node (i.e., the instance space) to deduce private training labels. The ID2Graph attack generates a graph structure from training samples, extracts communities from the graph, and clusters the local dataset using community information. To counteract label leakage from the instance space, we propose two effective defense mechanisms, Grafting-LDP, which improves the utility of label differential privacy with post-processing,…
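
The first two steps of the attack described in the abstract (building a graph from the instance space, then extracting communities) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the toy data, the choice to weight an edge by how many tree nodes contain both records, and the use of connected components over thresholded edges as a simple stand-in for a real community detection algorithm. The final step, clustering the attacker's local dataset with the community information, is left out.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(instance_spaces):
    """Weight edge (i, j) by the number of tree nodes whose instance space holds both IDs.

    Assumption for illustration: co-occurrence counts as edge weights.
    """
    weights = defaultdict(int)
    for ids in instance_spaces:
        for i, j in combinations(sorted(ids), 2):
            weights[(i, j)] += 1
    return weights


def extract_communities(record_ids, weights, min_weight):
    """Group records linked by edges of weight >= min_weight.

    Stand-in for community detection: connected components via union-find.
    """
    parent = {r: r for r in record_ids}

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), w in weights.items():
        if w >= min_weight:
            parent[find(i)] = find(j)

    groups = defaultdict(set)
    for r in record_ids:
        groups[find(r)].add(r)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Toy instance spaces: the record IDs assigned to each tree node (made-up data).
instance_spaces = [{0, 1, 2}, {3, 4, 5}, {0, 1}, {2, 3}, {4, 5}, {0, 1, 2}]
record_ids = list(range(6))

weights = build_cooccurrence_graph(instance_spaces)
communities = extract_communities(record_ids, weights, min_weight=2)
print(communities)
```

In this toy setup, records that repeatedly fall into the same nodes end up in the same group. The intuition is that trees split to separate labels, so frequent co-occurrence hints at a shared label; an attacker could then attach the community index to each local record as an extra feature before clustering.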

Taxonomy

Topics: Privacy-Preserving Technologies in Data