VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks
Zhaomin Wu, Junyi Hou, Bingsheng He

TL;DR
This paper introduces VertiBench, a benchmark for vertical federated learning that provides diverse real-world datasets and evaluation metrics, and that accounts for feature importance and feature correlation to improve algorithm assessment.
Contribution
The paper presents VertiBench, a comprehensive VFL benchmark with new datasets, metrics, and splitting methods addressing existing limitations in feature distribution diversity.
Findings
Existing VFL benchmarks cover only a limited range of real-world feature distributions.
Feature importance and correlation significantly impact VFL performance.
The new dataset enhances evaluation of image-image VFL algorithms.
Abstract
Vertical Federated Learning (VFL) is a crucial paradigm for training machine learning models on feature-partitioned, distributed data. However, due to privacy restrictions, few public real-world VFL datasets exist for algorithm evaluation, and these represent a limited array of feature distributions. Existing benchmarks often resort to synthetic datasets, derived from arbitrary feature splits from a global set, which only capture a subset of feature distributions, leading to inadequate algorithm performance assessment. This paper addresses these shortcomings by introducing two key factors affecting VFL performance - feature importance and feature correlation - and proposing associated evaluation metrics and dataset splitting methods. Additionally, we introduce a real VFL dataset to address the deficit in image-image VFL scenarios. Our comprehensive evaluation of cutting-edge VFL…
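As a rough illustration of the setting the abstract describes, the sketch below builds a synthetic VFL dataset by arbitrarily splitting the feature columns of a global table across parties, then measures how correlated the two parties' features are. This is not VertiBench's splitting method or metric: the function names `random_vertical_split` and `mean_abs_cross_correlation`, the toy data, and the choice of mean absolute Pearson correlation are all assumptions made for illustration.

```python
import numpy as np


def random_vertical_split(n_features, n_parties, seed=0):
    """Assign each feature column to exactly one party, uniformly at random.

    Hypothetical helper: mimics the 'arbitrary feature split' baseline,
    not the importance- or correlation-aware splits proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_features)
    return [np.sort(part) for part in np.array_split(perm, n_parties)]


def mean_abs_cross_correlation(X, cols_a, cols_b):
    """Mean absolute Pearson correlation between two parties' feature sets.

    Illustrative proxy for inter-party feature correlation (an assumption,
    not the paper's metric).
    """
    corr = np.corrcoef(X, rowvar=False)
    return float(np.abs(corr[np.ix_(cols_a, cols_b)]).mean())


# Toy global dataset: 200 samples, 6 features, two of them strongly correlated.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=200)

# Each party sees all samples but only its own subset of columns.
parts = random_vertical_split(X.shape[1], n_parties=2)
local_views = [X[:, cols] for cols in parts]

score = mean_abs_cross_correlation(X, parts[0], parts[1])
print("party columns:", [p.tolist() for p in parts])
print("inter-party correlation:", round(score, 3))
```

Changing the seed changes which party holds the correlated columns, so the same global table can yield quite different inter-party correlation. That variability is one way to see why a single arbitrary split covers only part of the space of feature distributions.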
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning
