Entity Augmentation for Efficient Classification of Vertically Partitioned Data with Limited Overlap
Avi Amalanshu, Viswesh Nagaswamy, G. V. S. S. Prudhvi, Yash Sirvi, Debashish Chakravarty

TL;DR
This paper introduces an entity augmentation method for vertical federated learning that improves classification accuracy with limited data overlap, eliminating the need for costly entity alignment procedures.
Contribution
The paper presents a novel entity augmentation technique that removes the need for set intersection and entity alignment in vertical federated learning (VFL) for categorical tasks, improving both efficiency and accuracy.
Findings
Significantly improves accuracy with limited data overlap (e.g., 5%).
Achieves comparable or better performance than traditional methods even with full overlap.
Reduces computational complexity by eliminating entity alignment steps.
Abstract
Vertical Federated Learning (VFL) is a machine learning paradigm for learning from vertically partitioned data (i.e. features for each input are distributed across multiple "guest" clients and an aggregating "host" server owns labels) without communicating raw data. Traditionally, VFL involves an "entity resolution" phase where the host identifies and serializes the unique entities known to all guests. This is followed by private set intersection to find common entities, and an "entity alignment" step to ensure all guests are always processing the same entity's data. However, using only data of entities from the intersection means guests discard potentially useful data. Besides, the effect on privacy is dubious and these operations are computationally expensive. We propose a novel approach that eliminates the need for set intersection and entity alignment in categorical tasks. Our…
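To make the traditional pipeline described in the abstract concrete, the following is a minimal sketch of intersecting entity IDs and aligning guest records, and of how many records end up discarded. It illustrates only the baseline, not the proposed entity augmentation method. Everything here is an assumption for illustration: the toy data, the names `guest_a`, `guest_b`, `host_labels`, and `align_on_intersection` are hypothetical, and the plain Python set intersection stands in for a cryptographic private set intersection protocol, so it offers no privacy.

```python
# Hypothetical toy data: each guest holds different features for the
# entities it knows, keyed by entity ID. The host holds the labels.
guest_a = {"e1": [0.2, 1.0], "e2": [0.5, 0.1], "e3": [0.9, 0.4], "e4": [0.3, 0.8]}
guest_b = {"e2": [7.0], "e3": [3.0], "e5": [1.0], "e6": [4.0]}
host_labels = {"e1": 0, "e2": 1, "e3": 0, "e4": 1, "e5": 1, "e6": 0}


def align_on_intersection(guests, labels):
    """Non-private stand-in for set intersection plus entity alignment.

    Returns the shared entity IDs in a fixed order, each guest's features
    in that same order, and the matching labels.
    """
    common = set(labels)
    for guest in guests:
        common &= set(guest)  # keep only entities every party knows
    order = sorted(common)  # a fixed order so all guests process the same entity
    aligned = [[guest[e] for e in order] for guest in guests]
    y = [labels[e] for e in order]
    return order, aligned, y


order, aligned, y = align_on_intersection([guest_a, guest_b], host_labels)

# Records each guest holds but cannot use because they fall outside the intersection.
discarded = sum(len(guest) - len(order) for guest in (guest_a, guest_b))

print("shared entities:", order)
print("discarded guest records:", discarded)
```

In this toy setup only two of the six labeled entities survive the intersection, so half of each guest's records go unused, which is the data loss the abstract points to.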
Taxonomy
Topics: Data Stream Mining Techniques · Anomaly Detection Techniques and Applications
Methods: Sparse Evolutionary Training
