Federated ADMM from Bayesian Duality
Thomas Möllenhoff, Siddharth Swaroop, Finale Doshi-Velez, Mohammad Emtiyaz Khan

TL;DR
This paper introduces a Bayesian generalization of federated ADMM using variational-Bayesian duality, leading to new algorithms with improved convergence and accuracy for diverse distributed optimization problems.
Contribution
It develops a Bayesian framework that extends federated ADMM via variational-Bayesian duality, enabling new variants with enhanced performance.
Findings
ADMM-like updates are recovered with isotropic-Gaussian VB objectives.
New variants include Newton-like and Adam-like algorithms.
Up to 7% accuracy improvements in deep heterogeneous federated learning.
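The one-step behaviour claimed for the Newton-like variant mirrors a textbook property of Newton's method. The derivation below states that classical fact for a generic quadratic; it is not the paper's federated update, and the symbols $A$, $b$, and $x_0$ are illustrative assumptions rather than the paper's notation.

```latex
% Assumed generic quadratic with symmetric positive-definite A:
f(x) = \tfrac{1}{2} x^\top A x - b^\top x, \qquad
\nabla f(x) = A x - b, \qquad
\nabla^2 f(x) = A .
% One Newton step from an arbitrary starting point x_0:
x_1 = x_0 - A^{-1} (A x_0 - b) = A^{-1} b = \arg\min_x f(x) .
```

Because the Hessian of a quadratic is constant, the second-order model is exact and the step lands on the minimizer regardless of $x_0$.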
Abstract
We propose a new Bayesian approach to generalize the federated Alternating Direction Method of Multipliers (ADMM). We show that the solutions of variational-Bayesian (VB) objectives are associated with a duality structure that not only resembles the structure of ADMM's fixed-points but also generalizes it. For example, ADMM-like updates are recovered when the VB objective is optimized over the isotropic-Gaussian family, and new non-trivial extensions are obtained for other exponential-family distributions. These extensions include a Newton-like variant that converges in one step on quadratic objectives and an Adam-like variant that yields up to 7% accuracy boosts for deep heterogeneous cases. Our work opens a new Bayesian way to generalize ADMM and other primal-dual methods.
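For orientation, here is a minimal sketch of classical federated (consensus) ADMM, the baseline that the abstract says is recovered in the isotropic-Gaussian case. It is not the paper's Bayesian variant. The scalar quadratic client losses, the function name `federated_admm`, and the parameters `rho` and `num_rounds` are all assumptions made for illustration.

```python
def federated_admm(curvatures, centers, rho=1.0, num_rounds=200):
    """Consensus ADMM for clients with f_k(x) = 0.5 * a_k * (x - c_k)**2.

    Illustrative only: scalar quadratic losses are an assumption of this sketch.
    """
    num_clients = len(curvatures)
    z = 0.0                      # server (global) variable
    x = [0.0] * num_clients      # local client variables
    y = [0.0] * num_clients      # dual variables, one per client
    for _ in range(num_rounds):
        # Client step: minimize f_k(x) + y_k * (x - z) + (rho / 2) * (x - z)**2,
        # which has a closed-form solution for a quadratic f_k.
        x = [(a * c - y_k + rho * z) / (a + rho)
             for a, c, y_k in zip(curvatures, centers, y)]
        # Server step: average the dual-corrected client iterates.
        z = sum(x_k + y_k / rho for x_k, y_k in zip(x, y)) / num_clients
        # Dual step: gradient ascent on the consensus constraint x_k = z.
        y = [y_k + rho * (x_k - z) for x_k, y_k in zip(x, y)]
    return z, x, y


# Three hypothetical clients with heterogeneous curvature and minimizers.
curvatures = [1.0, 4.0, 0.5]
centers = [2.0, -1.0, 3.0]
z, x, y = federated_admm(curvatures, centers)

# Closed-form minimizer of the summed objective, used as a reference.
exact = sum(a * c for a, c in zip(curvatures, centers)) / sum(curvatures)
print(abs(z - exact))
```

Each round needs only one exchange per client (the local iterate and dual variable going up, the server variable coming down), which is the communication pattern that makes ADMM attractive in federated settings.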
Peer Reviews
Decision: ICLR 2026 Poster
The paper proposes a novel ADMM-like extension to federated learning, with good experimental results.
It seemed that the argument was more by analogy than exact equality. The claim that ADMM is recovered exactly is misleading because it requires an approximation and therefore is not necessarily recovered exactly. A couple of minor points:
- In equation 3, a subscript k is missing.
- On line 195, A* should be defined in the main text rather than just in the appendix.
- In figure 4, it would be helpful to mention that each line is numbered with the iteration number.
The authors introduce a Bayesian duality, from which an extension of ADMM that optimizes over distributions naturally follows. For Gaussians with fixed variance, they recover regular ADMM, while general Gaussians give Newton-like methods and IVON-ADMM. These show good performance when compared to recent baselines. Other approximating distributions may lead to new interesting splitting algorithms, which more generally opens up new research paths to extend and improve primal-dual algorithms using B
In the federated learning ADMM framework, there are theoretical guarantees for communication complexity and iteration complexity. Can the authors briefly discuss the communication complexity and iteration complexity of Bayesian ADMM?
__Conceptual novelty:__ The paper establishes a novel Bayesian duality perspective that unifies ADMM and VB under a single framework. This is an interesting connection that could inspire extensions of primal-dual optimization methods. __Clear motivation and exposition:__ The introduction and background are well written and clearly position the work relative to prior ADMM and PVI approaches (Swaroop et al., 2025). __Framework generality:__ The proposed Bayesian duality formulation provides a pri
__Soundness of the Formulation:__ While the high-level idea is promising, the derivation in section 3.3 raises concerns about mathematical consistency: - The "Bayesian ADMM" updates (Eqns. 12-14) are expected to follow from alternating optimization of the Lagrangian in Eqn. 11. However, the replacement of the dual update term $\mu_k - \bar{\mu}$ with $\lambda_k - \bar{\lambda}$ lacks justification within the Lagrangian formulation. The reasoning provided in Appendix E.2, appealing to Bayesian in
Taxonomy
Topics: Cryptography and Data Security · Distributed systems and fault tolerance · Error Correcting Code Techniques
Methods: Alternating Direction Method of Multipliers
