Second-Order Min-Max Optimization with Lazy Hessians
Lesi Chen, Chengchang Liu, Jingzhao Zhang

TL;DR
This paper introduces a more efficient second-order optimization method for convex-concave minimax problems that reuses Hessian computations, reducing overall computational complexity and improving upon previous methods.
Contribution
It proposes a novel approach to reuse Hessians across iterations, significantly lowering computational costs for second-order minimax optimization methods.
Findings
Reduced computational complexity by a factor of d^{1/3}
Achieved faster convergence rates for convex-concave minimax problems
Numerical experiments confirm improved efficiency over existing methods
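The d^{1/3} factor can be checked by comparing the dominant terms of the two complexity bounds. The check below is a back-of-the-envelope sketch, not a derivation. It assumes a bound of O((N + d^2) d ε^{-2/3}) for prior second-order methods and Õ((N + d^2)(d + d^{2/3} ε^{-2/3})) for the lazy-Hessian method, where N is the assumed cost of one first-order oracle call, d is the dimension, and logarithmic factors are ignored.

```latex
% Regime in which the second term of the lazy bound dominates:
%   d^{2/3} \epsilon^{-2/3} \ge d  \iff  \epsilon \le d^{-1/2}.
\[
\frac{(N + d^2)\, d\, \epsilon^{-2/3}}
     {(N + d^2)\, d^{2/3}\, \epsilon^{-2/3}}
= d^{\,1 - 2/3}
= d^{1/3},
\qquad \text{for } \epsilon \le d^{-1/2}.
\]
```

Under these assumptions the saving grows with the dimension, so it should be most visible in high-dimensional problems solved to high accuracy.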
Abstract
This paper studies second-order methods for convex-concave minimax optimization. Monteiro and Svaiter (2012) proposed a method to solve the problem with an optimal iteration complexity of $\mathcal{O}(\epsilon^{-2/3})$ to find an $\epsilon$-saddle point. However, it is unclear whether the computational complexity, $\mathcal{O}((N+d^2) d \epsilon^{-2/3})$, can be improved. In the above, we follow Doikov et al. (2023) and assume the complexity of obtaining a first-order oracle as $N$ and the complexity of obtaining a second-order oracle as $dN$. In this paper, we show that the computation cost can be reduced by reusing the Hessian across iterations. Our methods take the overall computational complexity of $\tilde{\mathcal{O}}((N+d^2)(d + d^{2/3}\epsilon^{-2/3}))$, which improves those of previous methods by a factor of $d^{1/3}$. Furthermore, we generalize our method to…
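The idea of reusing a Hessian across iterations can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a plain Newton-type (chord) iteration on a toy problem, and it omits the extragradient step mentioned in the reviews as well as any regularization. The toy objective, the function names `make_problem` and `lazy_newton`, and the refresh period `m` are all assumptions made for illustration.

```python
import numpy as np


def make_problem(d, mu=1.0, seed=0):
    """Toy convex-concave objective (an assumption for illustration):
    f(x, y) = sum(x^4)/4 + mu/2*|x|^2 + x^T B y - sum(y^4)/4 - mu/2*|y|^2,
    whose unique saddle point is (x, y) = (0, 0)."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((d, d)) / np.sqrt(d)

    def F(z):
        # Gradient operator F(z) = (grad_x f, -grad_y f); saddle points solve F(z) = 0.
        x, y = z[:d], z[d:]
        return np.concatenate([x**3 + mu * x + B @ y, y**3 + mu * y - B.T @ x])

    def J(z):
        # Jacobian of F, playing the role of the expensive second-order oracle.
        x, y = z[:d], z[d:]
        top = np.hstack([np.diag(3 * x**2 + mu), B])
        bottom = np.hstack([-B.T, np.diag(3 * y**2 + mu)])
        return np.vstack([top, bottom])

    return F, J


def lazy_newton(F, J, z0, num_iters=30, m=5):
    """Newton-type iteration that refreshes the Jacobian only every m steps."""
    z = z0.copy()
    hessian_evals = 0
    J_inv = None
    for t in range(num_iters):
        if t % m == 0:
            # Snapshot step: pay for the second-order oracle and the O(d^3) solve once.
            # (Caching an LU factorization would be the more usual choice in practice.)
            J_inv = np.linalg.inv(J(z))
            hessian_evals += 1
        # Every step uses a fresh first-order oracle call with the cached snapshot.
        z = z - J_inv @ F(z)
    return z, hessian_evals


d = 10
F, J = make_problem(d)
z0 = np.full(2 * d, 0.1)  # start close to the saddle point

z_lazy, evals_lazy = lazy_newton(F, J, z0, num_iters=30, m=5)
z_full, evals_full = lazy_newton(F, J, z0, num_iters=30, m=1)

print("lazy: Hessian evals =", evals_lazy, " residual =", np.linalg.norm(F(z_lazy)))
print("full: Hessian evals =", evals_full, " residual =", np.linalg.norm(F(z_full)))
```

With `m=5` the sketch calls the second-order oracle 6 times over 30 iterations instead of 30, while still driving the residual toward zero on this toy problem. The starting point is deliberately chosen close to the saddle point, since an unregularized chord iteration like this one has no global convergence guarantee.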
Peer Reviews
Decision · ICLR 2025 Oral
1. A new algorithm for min-max optimization with better computational complexity than existing results. 2. The paper is well organized and the flow is easy to follow.
1. The main component seems to be a combination of Doikov et al. (2023) on lazy Hessians and Adil et al. (2022) on extragradient, which may limit the novelty somewhat. 2. The experiments could be strengthened. - First, the $O(d^{1/3})$ improvement suggests the advantage holds in high-dimensional cases (but not in low-dimensional ones), yet the current experiments do not exhibit such a pattern. How does the algorithm perform in the low-dimensional case? - It is not clear how the choice of $
I guess the paper is good from a mathematical point of view. The results are strong, original, and well presented. I agree with the authors that they developed significantly new tricks to work with this class of problems. I guess the paper is good!
For me, the main drawback is the motivation. I do not understand why we should use a second-order method with expensive iterations rather than a first-order one. I understand the motivation in convex optimization, where an optimal second-order scheme significantly reduces the number of iterations, but I do not understand it for saddle point problems (SPP), where the difference is minor.
The authors propose a method with the same oracle complexity as state-of-the-art methods but with lower computational complexity. This means that their algorithm can achieve the same estimation error with the same number of iterations while spending fewer computational resources and less time overall. The experimental results only support this point. This makes the method more attractive from a practical point of view. The paper is written in a clear way, and it is easy to understand.
Overall, the paper feels like a very incremental result. The authors apply a known technique for reducing the number of Hessian computations to an existing second-order method for solving convex-concave min-max problems. To adapt the proposed method to strongly-convex-strongly-concave problems, the authors use a universal restart framework that works like a "wrap" around any method for convex(-concave) problems and gives better theoretical convergence for strongly-convex(-strongly-concave) ones. Despite the fact that this p
Taxonomy
Topics: Metaheuristic Optimization Algorithms Research · Quantum Computing Algorithms and Architecture
