On the Initialization for Convex-Concave Min-max Problems
Mingrui Liu, Francesco Orabona

TL;DR
This paper investigates how the initialization affects convergence rates in convex-concave min-max problems, showing that strict convexity and concavity enable initialization-dependent rates and proposing algorithms that leverage this for faster convergence.
Contribution
The paper shows that strict convexity and strict concavity suffice for initialization-dependent convergence rates, and it introduces parameter-free algorithms that achieve improved rates without learning-rate tuning.
Findings
Initialization-dependent convergence rates are achievable under strict convexity and concavity.
Parameter-free algorithms can attain improved asymptotic rates without learning rate tuning.
The proposed algorithms demonstrate faster convergence in experiments.
Abstract
Convex-concave min-max problems are ubiquitous in machine learning, and people usually utilize first-order methods (e.g., gradient descent ascent) to find the optimal solution. One feature which separates convex-concave min-max problems from convex minimization problems is that the best known convergence rates for min-max problems have an explicit dependence on the size of the domain, rather than on the distance between initial point and the optimal solution. This means that the convergence speed does not have any improvement even if the algorithm starts from the optimal solution, and hence, is oblivious to the initialization. Here, we show that strict-convexity-strict-concavity is sufficient to get the convergence rate to depend on the initialization. We also show how different algorithms can asymptotically achieve initialization-dependent convergence rates on this class of functions.…
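To make the role of initialization concrete, here is a minimal sketch of projected gradient descent ascent on a toy problem. It is not one of the paper's algorithms. The objective f(x, y) = 0.5x^2 + xy - 0.5y^2, the step size `eta`, the box `radius`, and the function name `projected_gda` are all assumptions chosen for illustration. The objective is strongly convex in x and strongly concave in y (a special case of the strict setting the abstract describes), with its saddle point at (0, 0).

```python
import math


def projected_gda(x0, y0, eta=0.1, steps=200, radius=10.0):
    """Simultaneous projected gradient descent ascent (illustrative only).

    Toy objective (an assumption, not taken from the paper):
        f(x, y) = 0.5*x**2 + x*y - 0.5*y**2
    minimized over x and maximized over y on the box [-radius, radius]^2.
    Returns the Euclidean distance to the saddle point (0, 0) after `steps`.
    """
    def clip(v):
        # Projection onto the interval [-radius, radius].
        return max(-radius, min(radius, v))

    x, y = x0, y0
    for _ in range(steps):
        grad_x = x + y  # partial derivative of f with respect to x
        grad_y = x - y  # partial derivative of f with respect to y
        # Descent step in x, ascent step in y, both using the old iterate.
        x, y = clip(x - eta * grad_x), clip(y + eta * grad_y)
    return math.hypot(x, y)


if __name__ == "__main__":
    # Compare starting points at different distances from the saddle point.
    for start in [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0)]:
        print(start, projected_gda(*start, steps=50))
```

On this particular toy problem, a start closer to the saddle point ends closer to it after the same number of steps, and a start at the saddle point never moves. The example only illustrates what an initialization-dependent guarantee would capture; it says nothing about the rates proved in the paper.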
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM
