A Coupled Design of Exploiting Record Similarity for Practical Vertical Federated Learning
Zhaomin Wu, Qinbin Li, Bingsheng He

TL;DR
This paper introduces FedSim, a coupled training paradigm for vertical federated learning (VFL) that feeds record similarities from the linkage step into training, improving performance and extending applicability to real-world scenarios where identifiers are fuzzy.
Contribution
FedSim is the first approach to integrate one-to-many record linkage into VFL training, enhancing accuracy and enabling applications with fuzzy identifiers.
Findings
FedSim outperforms state-of-the-art baselines on eight datasets.
Using record similarities from linkage during training improves VFL performance.
Theoretical analysis reveals privacy risks of sharing similarities.
Abstract
Federated learning is a learning paradigm to enable collaborative learning across different parties without revealing raw data. Notably, vertical federated learning (VFL), where parties share the same set of samples but only hold partial features, has a wide range of real-world applications. However, most existing studies in VFL disregard the "record linkage" process. They design algorithms either assuming the data from different parties can be exactly linked or simply linking each record with its most similar neighboring record. These approaches may fail to capture the key features from other less similar records. Moreover, such improper linkage cannot be corrected by training since existing approaches provide no feedback on linkage during training. In this paper, we design a novel coupled training paradigm, FedSim, that integrates one-to-many linkage into the training process. Besides…
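The contrast the abstract draws between linking each record to its single most similar neighbor and one-to-many linkage can be illustrated with a small sketch. This is not the authors' implementation: the function name top_k_linkage, the toy coordinates, and the use of negated Euclidean distance as the similarity score are all assumptions made for illustration only.

```python
import math

def top_k_linkage(ids_a, ids_b, k):
    """Hypothetical helper: link each identifier in ids_a to its k most
    similar identifiers in ids_b, returning (index, similarity) pairs."""
    linked = []
    for a in ids_a:
        # Assumed similarity: negated Euclidean distance (larger = more similar).
        scored = [(j, -math.dist(a, b)) for j, b in enumerate(ids_b)]
        # Most similar candidates first.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        linked.append(scored[:k])
    return linked

# Made-up fuzzy identifiers, e.g. noisy 2-D coordinates held by two parties.
ids_a = [(0.0, 0.0), (5.0, 5.0)]
ids_b = [(0.1, 0.0), (0.0, 0.3), (5.0, 4.8), (9.0, 9.0)]

# One-to-one linkage keeps only the nearest neighbor of each record.
one_to_one = top_k_linkage(ids_a, ids_b, k=1)

# One-to-many linkage keeps several candidates together with their
# similarities, which a downstream model could then weigh during training.
one_to_many = top_k_linkage(ids_a, ids_b, k=2)
```

With k=1 each record carries only its closest match, so a wrong match has no alternative to fall back on. With k greater than 1 the less similar candidates and their similarity scores remain available to the training step.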
Taxonomy
Topics: Privacy-Preserving Technologies in Data
Methods: Greedy Policy Search
