Exploring New Frontiers in Vertical Federated Learning: the Role of Saddle Point Reformulation
Aleksandr Beznosikov, Georgiy Kormakov, Alexander Grigorievskiy, Mikhail Rudakov, Ruslan Nazykov, Alexander Rogozin, Anton Vakhrushev, Andrey Savchenko, Martin Takáč, Alexander Gasnikov

TL;DR
This paper introduces a saddle point reformulation for Vertical Federated Learning, enabling new stochastic algorithms with convergence guarantees, efficient communication, and asynchronous updates, advancing the practical deployment of VFL.
Contribution
It presents a novel saddle point reformulation of VFL that facilitates stochastic methods, compression, partial participation, and coordinate selection, which were difficult in traditional formulations.
Findings
Saddle point reformulation improves algorithm flexibility.
Proposed methods achieve convergence guarantees.
Numerical experiments validate effectiveness.
Abstract
The objective of Vertical Federated Learning (VFL) is to collectively train a model using features available on different devices while sharing the same users. This paper focuses on the saddle point reformulation of the VFL problem via the classical Lagrangian function. We first demonstrate how this formulation can be solved using deterministic methods. More importantly, we explore various stochastic modifications to adapt to practical scenarios, such as employing compression techniques for efficient information transmission, enabling partial participation for asynchronous communication, and utilizing coordinate selection for faster local computation. We show that the saddle point reformulation plays a key role and opens up possibilities to use the mentioned extensions, which seem to be impossible in the standard minimization formulation. Convergence estimates are provided for each algorithm,…
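As a rough illustration of the reformulation the abstract describes, the block below writes a generic VFL objective and a Lagrangian saddle point form of it. The notation ($n$ devices, feature blocks $A_i$, local weights $w_i$, labels $b$, loss $\ell$, regularizers $r_i$, auxiliary variable $z$, dual variable $y$) is assumed here for illustration and may differ from the paper's.

```latex
% Assumed notation: device i holds the feature block A_i and the weights w_i.
\begin{align*}
\text{minimization form:}\quad
  & \min_{w_1,\dots,w_n}\; \ell\Big(\sum_{i=1}^{n} A_i w_i,\; b\Big) + \sum_{i=1}^{n} r_i(w_i) \\
\text{constrained form:}\quad
  & \min_{w,\,z}\; \ell(z, b) + \sum_{i=1}^{n} r_i(w_i)
    \quad \text{s.t.} \quad \sum_{i=1}^{n} A_i w_i = z \\
\text{saddle point form:}\quad
  & \min_{w,\,z}\,\max_{y}\; \ell(z, b) + \sum_{i=1}^{n} r_i(w_i)
    + y^\top\Big(\sum_{i=1}^{n} A_i w_i - z\Big)
\end{align*}
```

In this sketch the devices are coupled only through the terms $y^\top A_i w_i$, which are linear in each $w_i$. That separation is one plausible reason why compression, partial participation, and coordinate selection are easier to apply here than in the minimization form.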
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
1. Reformulating Vertical Federated Learning (VFL) as a saddle point problem is interesting and novel; it offers an alternative to traditional minimization methods that could address VFL-specific challenges more effectively.
2. The paper presents comprehensive theoretical results.
3. The practical modifications for improving communication efficiency, asynchronous participation, and computational costs are well aligned with real-world VFL challenges.
1. While the paper introduces several modifications to the basic deterministic algorithm, such as quantization, biased compression, and asynchronous participation, these are presented with high mathematical density and minimal illustrative examples. This makes it challenging for readers like me, who are less familiar with saddle point methods and vertical federated learning, to fully grasp each modification's practical implications and implementation nuances. I suggest that the authors enhance acc
- The saddle point reformulation seems to be natural and well-motivated.
- When the model is linear, the authors establish extensive convergence theory for the proposed algorithm and its extensions, accommodating key features such as communication compression, partial participation, and local steps. In addition, the convergence rate of EG improves upon that of GD in terms of $\lambda_{\max}(A^\top A)$.
- The proposed algorithms only have convergence guarantees for VFL with the linear model, and the extension to nonconvex problems remains heuristic.
- In the experiments, only general-purpose optimizers are compared, while existing algorithms specifically designed for VFL (e.g., [1] and its baselines) are completely missing.
- Figure 1 and Figure 2 only present the relative objective gap w.r.t. the number of iterations. This might be unfair since the per-iteration computational and communicatio
The authors start with the basic reformulation in Section 2 and then thoroughly consider several stochastic modifications: quantization for efficient communication, biased compression, partial participation for asynchronous communication, and coordinate descent for reducing local computational cost. For each case, a modified algorithm is presented with a complete proof of the $O(1/K)$ convergence rate.
There seems to be a gap between the linear models considered and the non-convex models in formulations (4) on page 3 and (7) on page 9. See Questions for details.
1. This paper proposes a new minimax framework for the VFL problem. The method has a better complexity constant than accelerated gradient descent.
2. The theoretical guarantee for the quantization modification of the saddle point problem in VFL is novel.
1. Insufficient Preliminaries: The paper lacks clear explanations of key concepts such as Vertical Federated Learning (VFL) modeling and biased/unbiased compression. The paper would be easier to understand if a "Preliminaries" section defining VFL and compression techniques were added before the technical details. In particular: 1.1 In Section 3.1, the introduction of compression techniques is missing, and key notations, such as $b^k$, appear without proper definition. This makes i
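The reviews above refer to extragradient (EG) applied to the linear model. The following is a minimal sketch of a generic extragradient loop on a Lagrangian saddle point form of ridge-regularized least squares with vertically split features. It is not the authors' exact algorithm: the squared loss, the regularization weight `lam`, the step-size rule, the two-device toy data, and all function names (`matvec`, `rmatvec`, `operator`, `step`, `run_extragradient`) are assumptions made for illustration.

```python
def matvec(A, x):
    """Return A @ x for a row-major list-of-lists matrix A."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T @ y."""
    return [sum(row[j] * yi for row, yi in zip(A, y)) for j in range(len(A[0]))]


def operator(A_blocks, b, lam, w_blocks, z, y):
    """Return (grad_w L, grad_z L, -grad_y L) for the assumed Lagrangian
    L(w, z, y) = 0.5*||z - b||^2 + 0.5*lam*sum_i ||w_i||^2 + y^T (sum_i A_i w_i - z)."""
    # Each device needs only its own block A_i and the shared dual vector y.
    g_w = [[lam * wj + gj for wj, gj in zip(w_i, rmatvec(A_i, y))]
           for A_i, w_i in zip(A_blocks, w_blocks)]
    g_z = [zk - bk - yk for zk, bk, yk in zip(z, b, y)]
    # The server aggregates the per-device products A_i w_i.
    products = [matvec(A_i, w_i) for A_i, w_i in zip(A_blocks, w_blocks)]
    g_y = [zk - sum(p[k] for p in products) for k, zk in enumerate(z)]
    return g_w, g_z, g_y


def step(point, direction, gamma):
    """Return point - gamma * direction for a (w_blocks, z, y) triple."""
    (w, z, y), (g_w, g_z, g_y) = point, direction
    new_w = [[wj - gamma * gj for wj, gj in zip(w_i, g_i)]
             for w_i, g_i in zip(w, g_w)]
    new_z = [zk - gamma * gk for zk, gk in zip(z, g_z)]
    new_y = [yk - gamma * gk for yk, gk in zip(y, g_y)]
    return new_w, new_z, new_y


def run_extragradient(A_blocks, b, lam=0.1, iters=2000):
    s = len(b)
    point = ([[0.0] * len(A_i[0]) for A_i in A_blocks], [0.0] * s, [0.0] * s)
    # Conservative step size: the Frobenius norm upper-bounds the spectral norm of A.
    frob_sq = sum(a * a for A_i in A_blocks for row in A_i for a in row)
    gamma = 0.5 / (max(lam, 1.0) + (frob_sq + 1.0) ** 0.5)
    for _ in range(iters):
        # Extrapolation: take a trial step using the operator at the current point.
        half = step(point, operator(A_blocks, b, lam, *point), gamma)
        # Update: step from the current point using the operator at the trial point.
        point = step(point, operator(A_blocks, b, lam, *half), gamma)
    return point


# Hypothetical toy data: two devices, one feature column each, four shared users.
A_blocks = [
    [[1.0], [0.0], [1.0], [2.0]],
    [[0.0], [1.0], [1.0], [-1.0]],
]
b = [1.0, 2.0, 3.0, 0.0]
w_blocks, z, y = run_extragradient(A_blocks, b)
```

On this toy instance the weights in `w_blocks` should approach the ridge regression solution $(A^\top A + \lambda I)^{-1} A^\top b$, where $A$ is the concatenation of the two feature blocks, and `z` should approach the aggregated prediction $\sum_i A_i w_i$.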
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced MIMO Systems Optimization
