Accelerated Mirror Descent Method through Variable and Operator Splitting
Long Chen, Hao Luo, Jingrong Wei, Zeyi Xu, Yuan Yao

TL;DR
This paper introduces an accelerated mirror descent method (Acc-MD) built on variable and operator splitting, establishes the first accelerated linear convergence for it under a new geometric condition (the Generalized Cauchy-Schwarz condition), and reports numerical experiments in which it outperforms existing accelerated variants.
Contribution
It develops an accelerated mirror descent algorithm from a variable-operator splitting framework and a new geometric condition, establishing the first accelerated linear convergence result for Acc-MD on a broad class of problems.
Findings
Acc-MD outperforms existing methods in experiments.
First accelerated linear convergence for Acc-MD proven under the GCS condition.
Numerical results confirm theoretical advantages.
Abstract
Mirror descent uses the mirror function to encode geometry and constraints, improving convergence while preserving feasibility. Accelerated Mirror Descent Methods (Acc-MD) are derived from a discretization of an accelerated mirror ODE system using a variable-operator splitting framework. A geometric assumption, termed the Generalized Cauchy-Schwarz (GCS) condition, is introduced to quantify the compatibility between the objective and the mirror geometry, under which the first accelerated linear convergence for Acc-MD on a broad class of problems is established. Numerical experiments on smooth and composite optimization tasks demonstrate that Acc-MD consistently outperforms existing accelerated variants, both theoretically and empirically.
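To make the abstract's first sentence concrete, the following is a minimal sketch of plain (non-accelerated) mirror descent with the negative-entropy mirror function on the probability simplex. It is not the paper's Acc-MD algorithm, which this page does not reproduce. The function name `entropic_mirror_descent`, the linear test objective, and the step size are illustrative assumptions.

```python
import math

def entropic_mirror_descent(grad, x0, step, iters):
    """Plain mirror descent with the negative-entropy mirror function.

    Illustrative sketch only: this is the classical multiplicative-weights
    update on the simplex, not the accelerated method from the paper.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Subtracting min(g) leaves the normalised update unchanged
        # but keeps every exponent non-positive, avoiding overflow.
        shift = min(g)
        w = [xi * math.exp(-step * (gi - shift)) for xi, gi in zip(x, g)]
        total = sum(w)
        # Renormalising is the Bregman projection onto the simplex for
        # this mirror function, so every iterate stays feasible.
        x = [wi / total for wi in w]
    return x

# Hypothetical test problem: minimise the linear cost <c, x> over the simplex.
c = [3.0, 1.0, 2.0]
x = entropic_mirror_descent(lambda _x: c, [1 / 3, 1 / 3, 1 / 3], step=0.5, iters=200)
```

Every iterate stays non-negative and sums to one without an explicit Euclidean projection, which is the feasibility-preserving property the abstract refers to. In this toy problem the mass concentrates on the coordinate with the smallest cost.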
Peer Reviews
Decision·ICLR 2026 Conference Withdrawn Submission
The paper is clearly written and tries to address an important gap in accelerated mirror descent optimization. The proposed Accelerated Mirror Descent (Acc-MD) is nice and simple and achieves linear convergence without relying on triangle-scaling assumptions as most prior methods do. Some experiments on entropic and quartic objectives show that Acc-MD achieves good results compared to existing methods.
It is not clear how sensitive Acc-MD is to inaccurate estimates of the relative smoothness and strong convexity constants (a very likely case in a real-life problem), since these constants control the step sizes and can make things break. Ablation studies are needed here. The experiments are limited to small synthetic convex problems (entropic and quartic objectives), with no large-scale or real-world benchmarks to show practical scalability or wall-clock gains. The authors need to test on datasets l
The paper is well written, and the contributions are clear. I enjoy reading the section on "Discussion on assumption (A2)," which explains when the main assumption is satisfied and why it is valid (however, as explained below, this section could simply be examples of A2 - just a paragraph with bullet points - rather than theorems with lengthy presentations). The numerical experiments verify the theoretical claims.
I find the presentation in several parts of the paper a bit confusing, introducing update rules and notations without properly presenting them. For example, the mirror descent update in the introduction of the paper is unnecessary and potentially confusing, as this is not a clear update rule (and does not serve a purpose to be there). The proper update rule was later given in section 2 where it was explained how the value of $x_{k+1}$ is actually updated using previous information. The paper ha
- The paper gives a clear derivation of the split flow, a semi-implicit discretization, and an implementable Algorithm 1. The derivation and algorithmic steps are presented in a self-contained way.
- The authors provide Lyapunov functions and detailed proofs that support the claimed decay and linear/sublinear rates under stated assumptions. The perturbation/homotopy path to handle is presented with the associated bounds.
- The paper includes numerical experiments (entropic MD, quartic objective,
- The paper refers to the continuous-time accelerated/variational literature (Wibisono et al., Krichene et al., Nesterov) and positions Acc-MD as a new discretization. However, the manuscript does not clearly separate what is genuinely new vs. what is a modest variant of existing accelerated mirror flows and discretizations. The related work and motivation sections cite the prior art, but the paper lacks a precise theorem-level comparison that isolates the distinct mathematical mechanism or perf
1. The authors provide rigorous theoretical results based on Lyapunov functions and Bregman methods, adapting them to the paper’s assumptions.
2. The authors provide scenarios where assumption A2 enables acceleration for Acc-MD while TSE-based methods fail. These observations are interesting and impressive.
1. The Acc-MD framework builds directly on the VOS framework and on discretization of continuous-time systems. It is a natural combination of existing methods, but the innovation is limited.
2. It is non-trivial to verify the assumption A2, which is the foundation of most analysis in this paper. Both Theorem 3.5 and Theorem 3.6 depend on smoothness of $f$ and strong convexity of $\phi$, which makes A2 unrealistic.
3. The authors present only one example where TSE fails while A2 holds. This is
Taxonomy
Topics: Advanced Optimization Algorithms Research · Stochastic Gradient Optimization Techniques · Advanced Numerical Methods in Computational Mathematics
