TL;DR
This paper introduces a novel regularization technique called 'MAN' that minimizes layerwise activation norms to promote flatter minima, thereby enhancing the generalization of federated learning models.
Contribution
It proposes a new flatness-constrained federated learning optimization method using activation norm minimization, with theoretical analysis and practical improvements over existing techniques.
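To make the idea of activation norm minimization concrete, here is a minimal sketch of a loss with a layerwise activation-norm penalty. It is an illustration under stated assumptions, not the paper's implementation: the tiny ReLU network, the squared-error task loss, the use of unsquared L2 norms summed over every layer, and the names `forward_with_activation_norms`, `regularized_loss`, and `lam` are all hypothetical choices made for this example.

```python
import math


def relu(v):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(W, b, x):
    """Dense layer: W is a list of rows, b a list of biases, x the input."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward_with_activation_norms(layers, x):
    """Run a small MLP and record the L2 norm of each layer's activation.

    Assumption: ReLU follows every layer except the last one.
    """
    norms = []
    h = x
    for i, (W, b) in enumerate(layers):
        h = linear(W, b, h)
        if i < len(layers) - 1:
            h = relu(h)
        norms.append(math.sqrt(sum(v * v for v in h)))
    return h, norms


def regularized_loss(layers, x, y, lam=0.1):
    """Task loss plus lam times the sum of layerwise activation norms.

    Assumption: mean squared error as the task loss; lam is a
    hypothetical regularization weight, not a value from the paper.
    """
    out, norms = forward_with_activation_norms(layers, x)
    task = sum((o - t) ** 2 for o, t in zip(out, y)) / len(y)
    return task + lam * sum(norms)


# Two-layer example: identity hidden layer, then a summing output layer.
layers = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]), ([[1.0, 1.0]], [0.0])]
loss = regularized_loss(layers, [3.0, 4.0], [7.0], lam=0.1)
```

In a federated setup, each client would add a penalty of this kind to its local objective before the server aggregates the locally trained models; how the paper weights or normalizes the per-layer terms is not stated in this summary.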
Findings
Significantly improved generalization of models trained with federated learning.
A theoretical proof that minimizing activation norms reduces the eigenvalues of the Hessian.
State-of-the-art results on federated learning benchmarks.
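The findings refer to the eigenvalues of the Hessian as a measure of sharpness. As a rough illustration of how the top eigenvalue can be estimated, the sketch below runs power iteration on a small, explicitly given symmetric matrix. This is a generic numerical technique and not the paper's procedure; the function name `top_eigenvalue` and the toy matrices are assumptions made for the example. Power iteration returns the eigenvalue of largest magnitude, which coincides with the top eigenvalue when the Hessian is positive semidefinite, as it is near a local minimum.

```python
import math


def matvec(H, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]


def top_eigenvalue(H, iters=200):
    """Estimate the largest-magnitude eigenvalue of a symmetric matrix H."""
    n = len(H)
    v = [1.0 / math.sqrt(n)] * n  # unit-norm starting vector
    lam = 0.0
    for _ in range(iters):
        w = matvec(H, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0  # H maps v to zero; nothing to estimate
        v = [x / norm for x in w]
        # Rayleigh quotient of the current unit vector.
        lam = sum(a * b for a, b in zip(v, matvec(H, v)))
    return lam


# Toy Hessians: a "sharp" minimum has a larger top eigenvalue than a "flat" one.
sharp = [[10.0, 0.0], [0.0, 1.0]]
flat = [[2.0, 0.0], [0.0, 1.0]]
sharp_top = top_eigenvalue(sharp)
flat_top = top_eigenvalue(flat)
```

For real networks the Hessian is never formed explicitly; the same iteration is usually driven by Hessian-vector products computed with automatic differentiation.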
Abstract
Federated Learning (FL) is an emerging machine learning framework that enables multiple clients (coordinated by a server) to collaboratively train a global model by aggregating the locally trained models without sharing any client's training data. Recent work has observed that learning in a federated manner may lead the aggregated global model to converge to a 'sharp minimum', thereby adversely affecting the generalizability of the FL-trained model. Therefore, in this work, we aim to improve the generalization performance of models trained in a federated setup by introducing a 'flatness'-constrained FL optimization problem. This flatness constraint is imposed on the top eigenvalue of the Hessian computed from the training loss. As each client trains a model on its local data, we further re-formulate this complex problem utilizing the client loss functions and propose a new…
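In symbols, the flatness-constrained problem described in the abstract can be sketched as follows. The notation is assumed for illustration and is not taken from the paper: w denotes the global model parameters, F_k the training loss of client k, p_k a nonnegative aggregation weight with the weights summing to one, K the number of clients, and τ a hypothetical flatness threshold.

```latex
\[
\min_{w} \; F(w) = \sum_{k=1}^{K} p_k \, F_k(w)
\quad \text{subject to} \quad
\lambda_{\max}\!\left( \nabla_w^{2} F(w) \right) \le \tau
\]
```

The constraint bounds the top eigenvalue of the Hessian of the training loss, which is the sense in which the abstract speaks of a 'flatness' constraint. How the paper rewrites this in terms of the individual client losses is cut off in the abstract above and is not reconstructed here.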
Videos
Minimizing Layerwise Activation Norm Improves Generalization in Federated Learning (YouTube)
