A Federated Learning Benchmark on Tabular Data: Comparing Tree-Based Models and Neural Networks
William Lindskog, Christian Prehofer

TL;DR
This paper benchmarks federated tree-based models (TBMs) and deep neural networks (DNNs) on tabular data, revealing that federated boosted TBMs, especially XGBoost, outperform DNNs across various data partitions and client counts.
Contribution
It provides the first comprehensive benchmark comparing federated TBMs and DNNs on tabular data, highlighting the superior performance of federated boosted TBMs.
Findings
Federated boosted TBMs outperform DNNs on tabular data.
Federated XGBoost achieves the best overall performance.
Federated TBMs still outperform federated DNNs as the number of clients increases.
Abstract
Federated Learning (FL) has lately gained traction as it addresses how machine learning models train on distributed datasets. FL was designed for parametric models, namely Deep Neural Networks (DNNs). Thus, it has shown promise on image and text tasks. However, FL for tabular data has received little attention. Tree-Based Models (TBMs) have been considered to perform better on tabular data and they are starting to see FL integrations. In this study, we benchmark federated TBMs and DNNs for horizontal FL, with varying data partitions, on 10 well-known tabular datasets. Our novel benchmark results indicate that current federated boosted TBMs perform better than federated DNNs in different data partitions. Furthermore, a federated XGBoost outperforms all other models. Lastly, we find that federated TBMs perform better than federated parametric models, even when increasing the number of…
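To make "horizontal FL with varying data partitions" concrete, here is a minimal sketch of how the rows of a tabular dataset might be split across clients. The abstract does not say which partitioning scheme the authors use, so everything below is an assumption for illustration: the function name `partition_rows` is hypothetical, and the label-skewed split uses a Dirichlet draw per class, a common choice in the FL literature rather than the paper's confirmed method.

```python
import numpy as np


def partition_rows(labels, num_clients, alpha=None, seed=0):
    """Split row indices across clients (horizontal partitioning).

    Hypothetical helper, not taken from the paper.
    alpha=None -> IID split: shuffle the rows and cut them into equal parts.
    alpha > 0  -> label-skewed split: for each class, the share given to
                  each client is drawn from Dirichlet(alpha); a smaller
                  alpha gives a more uneven split.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)

    if alpha is None:
        shuffled = rng.permutation(len(labels))
        return [np.sort(part) for part in np.array_split(shuffled, num_clients)]

    parts = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        # Rows of this class, in random order.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        # Per-client proportions for this class.
        props = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn cumulative proportions into cut points within idx.
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, chunk in enumerate(np.split(idx, cuts)):
            parts[client].extend(chunk.tolist())
    return [np.sort(np.array(p, dtype=int)) for p in parts]


# Example: 100 rows with two classes, split across 5 clients.
labels = np.array([0] * 60 + [1] * 40)
iid_parts = partition_rows(labels, num_clients=5)
skewed_parts = partition_rows(labels, num_clients=5, alpha=0.5, seed=1)
```

Every client holds all feature columns but only its own subset of rows, which is what distinguishes horizontal from vertical FL. Raising `num_clients` in a sketch like this corresponds to the client-count scaling the abstract mentions.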
