FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning
Peiran Xu, Zeyu Wang, Jieru Mei, Liangqiong Qu, Alan Yuille, Cihang Xie, Yuyin Zhou

TL;DR
This paper investigates how architectural modifications to CNNs can improve their robustness in federated learning with heterogeneous data, showing that CNNs can match or surpass Vision Transformers with proper design.
Contribution
It provides the first systematic analysis of architectural elements influencing CNN performance in heterogeneous federated learning and offers design principles to enhance robustness.
Findings
Modified CNN architectures can outperform standard CNNs in heterogeneous FL.
CNNs with strategic design can match or exceed ViT performance in FL.
The proposed approach is compatible with existing FL techniques and achieves state-of-the-art results.
Abstract
Federated learning (FL) is an emerging paradigm in machine learning, where a shared model is collaboratively learned using data from multiple devices to mitigate the risk of data leakage. While recent studies posit that Vision Transformer (ViT) outperforms Convolutional Neural Networks (CNNs) in addressing data heterogeneity in FL, the specific architectural components that underpin this advantage have yet to be elucidated. In this paper, we systematically investigate the impact of different architectural elements, such as activation functions and normalization layers, on the performance within heterogeneous FL. Through rigorous empirical analyses, we are able to offer the first-of-its-kind general guidance on micro-architecture design principles for heterogeneous FL. Intriguingly, our findings indicate that with strategic architectural modifications, pure CNNs can achieve a level of…
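To make the "shared model is collaboratively learned" idea concrete, here is a minimal sketch of one server-side aggregation step in the style of federated averaging (FedAvg), a common FL baseline. The abstract does not say which aggregation rule the paper uses, so this is an illustration under that assumption, not the authors' method. The names `federated_average`, `client_weights`, and `client_sizes` are hypothetical, and parameters are represented as plain lists of floats for simplicity.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client, and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        # Weighted mean of each coordinate across all clients.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


if __name__ == "__main__":
    # Two clients with unequal amounts of local data.
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    sizes = [1, 3]
    print(federated_average(clients, sizes))
```

With heterogeneous (non-IID) client data, the locally trained parameters can drift apart before this averaging step, which is the setting in which the abstract compares architectural choices such as activation functions and normalization layers.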
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Traffic Prediction and Management Techniques
Methods: Attention Is All You Need · Byte Pair Encoding · Dense Connections · Vision Transformer · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection · Layer Normalization · Linear Layer
