Online Alternating Direction Method of Multipliers for Online Composite Optimization

Yule Zhang; Zehao Xiao; Jia Wu; Liwei Zhang

arXiv:1904.02862·math.OC·February 9, 2024·J. Comput. Appl. Math.

TL;DR

This paper introduces an online semi-proximal ADMM algorithm with proven sublinear regret bounds for solving linearly constrained convex composite problems, demonstrating its effectiveness through theoretical analysis and numerical experiments.

Contribution

It develops an online ADMM method with regret bounds and analyzes its parameter settings, extending the applicability of ADMM to online convex optimization.

Findings

01

Achieves ${\rm O}(\sqrt{N})$ regret bounds for both the objective and the constraint violation.

02

Provides guidelines for parameter selection in online ADMM.

03

Validates theoretical results with numerical experiments.

Abstract

In this paper, we investigate regrets of an online semi-proximal alternating direction method of multipliers (Online-spADMM) for solving online linearly constrained convex composite optimization problems. Under mild conditions, we establish ${\rm O}(\sqrt{N})$ objective regret and ${\rm O}(\sqrt{N})$ constraint violation regret at round $N$ when the dual step-length is taken in $(0,(1+\sqrt{5})/2)$ and the penalty parameter $\sigma$ is taken as $\sqrt{N}$. We explain that the optimal value of the parameter $\sigma$ is of order ${\rm O}(\sqrt{N})$. Like the semi-proximal alternating direction method of multipliers (spADMM), Online-spADMM has the advantage of efficiently resolving the potential non-solvability of the subproblems. We show the usefulness of the obtained results when applied to different types of online optimization problems and verify the theoretical result by numerical…

Tables

Table 1: Comparison between the time-averaged objective regret of Online semi-proximal ADMM with different selections of the parameter $\tau$ (OLspADMM-$\tau$) and the Online Alternating Direction Method (OADM), where the total number of iterations is $T=5000$.

Dimension	OLspADMM-1.618	OLspADMM-0.3	OLspADMM-0.1	OADM
10	0.253	0.193	0.143	0.748
20	3.258	2.483	1.777	7.946
50	4.555	3.918	3.232	4.847
100	0.566	0.480	0.395	0.450

Table 2: Comparison between the running time (sec) of Online semi-proximal ADMM (Online-spADMM) and the Online Alternating Direction Method (OADM).

IterNum	Online-spADMM (n = 10)	Online-spADMM (n = 20)	Online-spADMM (n = 50)	Online-spADMM (n = 100)	OADM (n = 10)	OADM (n = 20)	OADM (n = 50)	OADM (n = 100)
5000	0.023	0.028	0.097	0.264	0.029	0.087	0.222	0.628
10000	0.046	0.055	0.212	0.536	0.055	0.174	0.441	1.261
20000	0.090	0.108	0.420	1.095	0.220	0.358	0.879	2.757
50000	0.225	0.272	1.086	2.933	0.562	0.782	2.249	6.515

Table 3: Comparison between the time-averaged objective regret of Online semi-proximal ADMM (Online-spADMM), the Online Alternating Direction Method (OADM), the Online Forward-Backward Splitting Method (FOBOS) and the Regularized Dual Averaging Method (RDA), where the total number of iterations is $T=5000$.

Dimension	Online-spADMM	OADM	FOBOS	RDA
10	0.472	2.086	1.081	1.262
20	0.341	3.850	0.596	4.939
50	2.981	9.708	3.606	22.180

Table 4: Same as Table 2 but for Lasso.

IterNum	Online-spADMM (n = 10)	Online-spADMM (n = 20)	Online-spADMM (n = 50)	OADM (n = 10)	OADM (n = 20)	OADM (n = 50)
5000	0.026	0.031	0.056	0.059	0.072	0.155
10000	0.052	0.061	0.112	0.113	0.141	0.310
20000	0.102	0.125	0.223	0.225	0.282	0.619
50000	0.257	0.308	0.567	0.556	0.705	1.551

Table 5: Comparison between the time-averaged objective regret of Online semi-proximal ADMM with different selections of the parameter $a$ (OLspADMM-$a$) and the Online Alternating Direction Method (OADM), where the total number of iterations is $T=5000$.

Dimension	OLspADMM-1	OLspADMM-2	OLspADMM-5	OADM
10	0.166	0.366	0.930	1.298
20	0.330	0.730	1.860	2.598
50	1.005	2.161	5.477	6.408
100	4.382	8.863	20.180	12.890

Table 6: Same as Table 2 but for TV.

IterNum	Online-spADMM (n = 10)	Online-spADMM (n = 20)	Online-spADMM (n = 50)	Online-spADMM (n = 100)	OADM (n = 10)	OADM (n = 20)	OADM (n = 50)	OADM (n = 100)
5000	0.028	0.029	0.046	0.109	0.061	0.071	0.146	0.382
10000	0.053	0.057	0.091	0.218	0.118	0.141	0.291	0.763
20000	0.103	0.114	0.180	0.442	0.237	0.282	0.578	1.521
50000	0.257	0.287	0.451	1.112	0.587	0.702	1.459	3.822

Equations

{\rm regret}_{T}({\cal A})=\sup_{\{\psi_{1},\dots,\psi_{T}\}\subset\Psi}\left\{\sum_{t=1}^{T}\psi_{t}(u^{t})-\min_{u\in\Phi}\sum_{t=1}^{T}\psi_{t}(u)\right\}

\psi_{t}(u)=f_{t}(x)+g(z),\quad\Phi=\{(x,z)\in{\cal U}:Ax+Bz=c\},

\begin{array}[]{ll}\min&F_{N}(x,z)=\displaystyle\sum_{t=1}^{N}[f_{t}(x)+g(z)]\\ {\rm s.t.}&Ax+Bz=c,x\in{\cal X},z\in{\cal Z}.\\ \end{array}

C:=\{x\in{\cal X}:Nx+b\in K\},

f_{t}(x)=\phi_{t}(x),\quad g(z)=\delta_{K}(z),\quad A=N,\quad B=-I

f_{t}(x)=\phi_{t}(x),\quad g(z)=R(z)+\delta_{{\cal X}}(z),\quad A=I,\quad B=-I

\begin{array}[]{ll}\min&f(x)+g(z)\\ {\rm s.t.}&Ax+Bz=c,x\in{\cal X},z\in{\cal Z}.\\ \end{array}

{\rm dist}_{B}(z,D)=\inf_{z^{\prime}\in D}\|z^{\prime}-z\|_{B}

\begin{array}[]{ll}\min&f_{k}(x)+g(z)\\ {\rm s.t.}&Ax+Bz=c,x\in{\cal X},z\in{\cal Z},\\ \end{array}

{\cal L}^{k}_{\sigma}(x,z;y):=f_{k}(x)+g(z)+\langle y,Ax+Bz-c\rangle+\displaystyle\frac{\sigma}{2}\|Ax+Bz-c\|^{2},\quad\forall\,(x,z,y)\in{\cal X}\times{\cal Z}\times{\cal Y}.

\begin{array}[]{l}x^{k+1}\in\hbox{arg}\min\,{\cal L}^{k}_{\sigma}(x,z^{k};y^{k})+\displaystyle\frac{\sigma}{2}\|x-x^{k}\|^{2}_{{\cal S}_{k}}\,,\\[5.69054pt] z^{k+1}\in\hbox{arg}\min\,{\cal L}^{k}_{\sigma}(x^{k+1},z;y^{k})+\displaystyle\frac{1}{2}\|z-z^{k}\|^{2}_{\cal T}\,,\\[5.69054pt] y^{k+1}=y^{k}+\tau\sigma(Ax^{k+1}+Bz^{k+1}-c).\end{array}
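
As a rough illustration of the three updates above, the following Python sketch instantiates them for one simple case. The instance is an assumption of this example and is not taken from the paper: $f_{k}(x)=\frac{1}{2}(a_{k}^{\top}x-b_{k})^{2}$, $g(z)=\lambda\|z\|_{1}$, $A=I$, $B=-I$, $c=0$, with ${\cal S}_{k}=0$ and ${\cal T}=0$. All function and variable names are hypothetical.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Proximal map of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def online_spadmm_lasso(rows, targets, lam, sigma, tau=1.0):
    """Sketch of the x-, z-, y-updates for the assumed instance
    f_k(x) = 0.5 * (a_k @ x - b_k)**2, g(z) = lam * ||z||_1,
    A = I, B = -I, c = 0, with S_k = 0 and T = 0 (assumptions).

    Returns the iterates x^{k+1}, z^{k+1} and the residuals x^{k+1} - z^{k+1}.
    """
    golden = (1.0 + np.sqrt(5.0)) / 2.0
    if not 0.0 < tau < golden:
        raise ValueError("tau must lie in (0, (1 + sqrt(5)) / 2)")
    n = rows.shape[1]
    x, z, y = np.zeros(n), np.zeros(n), np.zeros(n)
    xs, zs, residuals = [], [], []
    for a, b in zip(rows, targets):
        # x-update: minimize f_k(x) + <y, x - z> + (sigma/2) * ||x - z||^2,
        # i.e. solve (a a^T + sigma I) x = b a - y + sigma z.
        lhs = np.outer(a, a) + sigma * np.eye(n)
        x = np.linalg.solve(lhs, b * a - y + sigma * z)
        # z-update: minimize lam * ||z||_1 - <y, z> + (sigma/2) * ||x - z||^2,
        # whose solution is a soft-thresholding step.
        z = soft_threshold(x + y / sigma, lam / sigma)
        # Dual update with step-length tau * sigma on the residual Ax + Bz - c.
        r = x - z
        y = y + tau * sigma * r
        xs.append(x)
        zs.append(z)
        residuals.append(r)
    return np.array(xs), np.array(zs), np.array(residuals)


# Synthetic data, chosen arbitrarily for illustration.
rng = np.random.default_rng(0)
N, n = 200, 5
x_true = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
rows = rng.standard_normal((N, n))
targets = rows @ x_true
xs, zs, residuals = online_spadmm_lasso(rows, targets, lam=0.1, sigma=np.sqrt(N))
constraint_regret = float(np.sum(residuals ** 2))
```

Here `sigma=np.sqrt(N)` mirrors the penalty choice stated in the abstract; the data, `lam`, and the dimension are arbitrary. With nonzero ${\cal S}_{k}$ or ${\cal T}$ the two subproblems would change accordingly.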

{\rm regret}_{N}^{\rm obj}=\sum_{t=1}^{N}[f_{t}(x^{t})+g(z^{t})]-\min_{Ax+Bz=c}\left\{\sum_{t=1}^{N}f_{t}(x)+g(z)\right\}

{\rm regret}_{N}^{\rm ctr}=\sum_{t=1}^{N}\left[\|Ax^{t}+Bz^{t}-c\|^{2}\right],

\begin{array}[]{l}[f_{k}(x^{k+1})+g(z^{k+1})]-[f_{k}(\widehat{x})+g(\widehat{z})]\\[6.0pt] \leq\left[(2\sigma\tau)^{-1}\|y^{k}\|^{2}+\displaystyle\frac{1}{2}\|x^{k}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\displaystyle\frac{1}{2}\|z^{k}-\widehat{z}\|_{{\cal T}}^{2}+\displaystyle\frac{\sigma}{2}\|B(z^{k}-\widehat{z})\|^{2}\right.\\[6.0pt] \quad+\displaystyle\frac{1}{2}\left.\|z^{k}-z^{k-1}\|^{2}_{{\cal T}}+\left(1-\min\{\tau,\tau^{-1}\}\right)\displaystyle\frac{\sigma}{2}\|Ax^{k}+Bz^{k}-c\|^{2}\right]\\[10.0pt] -\left[(2\sigma\tau)^{-1}\|y^{k+1}\|^{2}+\displaystyle\frac{1}{2}\|x^{k+1}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\displaystyle\frac{1}{2}\|z^{k+1}-\widehat{z}\|_{{\cal T}}^{2}+\displaystyle\frac{\sigma}{2}\|B(z^{k+1}-\widehat{z})\|^{2}\right.\\[6.0pt] \end{array}

\begin{array}[]{l}\quad+\displaystyle\left.\frac{1}{2}\|z^{k+1}-z^{k}\|^{2}_{{\cal T}}+\left(1-\min\{\tau,\tau^{-1}\}\right)\displaystyle\frac{\sigma}{2}\|Ax^{k+1}+Bz^{k+1}-c\|^{2}\right]\\[10.0pt] -\left[\displaystyle\frac{1}{2}\tau(1-\tau+\min(\tau,\tau^{-1}))\sigma\|B(z^{k+1}-z^{k})\|^{2}+\displaystyle\frac{1}{2}\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}\right.\\[16.0pt] \quad+\left.\displaystyle\frac{1}{2}\|x^{k+1}-x^{k}\|_{\sigma{\cal S}_{k}}^{2}+\displaystyle\frac{1}{2}\|x^{k+1}-\widehat{x}\|^{2}_{\Sigma_{f_{k}}}+\displaystyle\frac{1}{2}\|z^{k+1}-\widehat{z}\|^{2}_{\Sigma_{g}}\right.\\[10.0pt] \quad+\left.\left(1-\tau+\min(\tau,\tau^{-1})\right)\displaystyle\frac{\sigma}{2}\|Ax^{k+1}+Bz^{k+1}-c\|^{2}\right].\end{array}

s_{\tau}:=\displaystyle\frac{1}{4}\Big{[}5-\tau-3\min\{\tau,\tau^{-1}\}\Big{]},t_{\tau}:=\displaystyle\frac{1}{2}\Big{[}1-\tau+\min\{\tau,\tau^{-1}\}\Big{]}.
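
A small numerical check of these two constants may help. The Python sketch below (function names are hypothetical) evaluates $s_{\tau}$ and $t_{\tau}$ and illustrates that $t_{\tau}>0$ exactly when $\tau<(1+\sqrt{5})/2$: for $\tau\leq 1$ one has $t_{\tau}=\frac{1}{2}$, and for $\tau>1$ the sign of $1-\tau+\tau^{-1}$ is that of $1+\tau-\tau^{2}$. This agrees with the dual step-length range quoted in the abstract.

```python
import math


def s_tau(tau):
    """s_tau = (1/4) * [5 - tau - 3 * min(tau, 1/tau)]."""
    return 0.25 * (5.0 - tau - 3.0 * min(tau, 1.0 / tau))


def t_tau(tau):
    """t_tau = (1/2) * [1 - tau + min(tau, 1/tau)]."""
    return 0.5 * (1.0 - tau + min(tau, 1.0 / tau))


# Upper end of the admissible dual step-length interval.
golden = (1.0 + math.sqrt(5.0)) / 2.0

# The sample values of tau are arbitrary; 1.7 lies just outside the interval.
for tau in (0.1, 0.3, 1.0, 1.618, 1.7):
    print(f"tau={tau:5.3f}  s_tau={s_tau(tau):.4f}  t_tau={t_tau(tau):+.4f}")
```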

\begin{array}[]{l}\overline{\cal M}_{k}=\mbox{Diag}\left(\sigma{\cal S}_{k}+\Sigma_{f_{k}},{\cal T}+\Sigma_{g}+\sigma B^{*}B\right)+s_{\tau}\overline{E}^{*}\overline{E},\\[6.0pt] \overline{\cal H}_{k}=\mbox{Diag}\left(\sigma{\cal S}_{k}+\displaystyle\frac{1}{2}\Sigma_{f_{k}},{\cal T}+\Sigma_{g}+2t_{\tau}\tau\sigma B^{*}B\right)+\displaystyle\frac{1}{4}t_{\tau}\sigma\overline{E}^{*}\overline{E}.\end{array}

\Sigma_{f_{k}}+\sigma{\cal S}_{k}+\sigma A^{*}A\succ 0\ \ \&\ \ \Sigma_{g}+{\cal T}+\sigma B^{*}B\succ 0\Longleftrightarrow\overline{\cal M}_{k}\succ 0\Longleftrightarrow\overline{\cal H}_{k}\succ 0.

\begin{array}[]{l}[f_{k}(x^{k+1})+g(z^{k+1})]-[f_{k}(\widehat{x})+g(\widehat{z})]\\[6.0pt] \leq\displaystyle\frac{1}{2}\left[(\tau\sigma)^{-1}\|y^{k}\|^{2}+\|x^{k}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k}-\widehat{z}\|_{{\cal T}}^{2}+\sigma\|B(z^{k}-\widehat{z})\|^{2}+\|z^{k}-z^{k-1}\|_{{\cal T}}^{2}\right.\\[6.0pt] \quad\quad\left.+s_{\tau}\sigma\|A(x^{k}-\widehat{x})+B(z^{k}-\widehat{z})\|^{2}+\|x^{k}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k}-\widehat{z}\|_{\Sigma_{g}}^{2}\right]\\[10.0pt] -\displaystyle\frac{1}{2}\left[(\tau\sigma)^{-1}\|y^{k+1}\|^{2}+\|x^{k+1}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k+1}-\widehat{z}\|_{{\cal T}}^{2}\right.\\[10.0pt] \quad\quad\quad\quad\quad\left.+\sigma\|B(z^{k+1}-\widehat{z})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}\right.\\[6.0pt] \quad\quad\left.+s_{\tau}\sigma\|A(x^{k+1}-\widehat{x})+B(z^{k+1}-\widehat{z})\|^{2}+\|x^{k+1}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k+1}-\widehat{z}\|_{\Sigma_{g}}^{2}\right]\\[10.0pt] -\displaystyle\frac{1}{2}\left[2t_{\tau}\sigma\tau\|B(z^{k+1}-z^{k})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}+\|x^{k+1}-x^{k}\|_{\sigma{\cal S}_{k}}^{2}\right.\\[10.0pt] \quad\quad\quad\left.+\displaystyle\frac{1}{2}\|x^{k+1}-x^{k}\|_{\Sigma_{f_{k}}}^{2}+\displaystyle\frac{1}{2}\|z^{k+1}-z^{k}\|_{\Sigma_{g}}^{2}+t_{\tau}\sigma\|Ax^{k+1}+Bz^{k+1}-c\|^{2}\right.\\[10.0pt] \quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\left.+\displaystyle\frac{1}{4}t_{\tau}\sigma\|A(x^{k+1}-x^{k})+B(z^{k+1}-z^{k})\|^{2}\right].\end{array}

\begin{array}[]{l}[f_{k}(x^{k+1})+g(z^{k+1})]-[f_{k}(\widehat{x})+g(\widehat{z})]\\[6.0pt] \leq\displaystyle\frac{1}{2}\left[(\tau\sigma)^{-1}\|y^{k}\|^{2}+\|x^{k}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k}-\widehat{z}\|_{{\cal T}}^{2}+\sigma\|B(z^{k}-\widehat{z})\|^{2}+\|z^{k}-z^{k-1}\|_{{\cal T}}^{2}\right.\\[6.0pt] \quad\quad\quad\quad\quad\quad\left.+(1-\min\{\tau,\tau^{-1}\})\sigma\|Ax^{k}+Bz^{k}-c\|^{2}\right]\\[10.0pt] -\displaystyle\frac{1}{2}\left[(\tau\sigma)^{-1}\|y^{k+1}\|^{2}+\|x^{k+1}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k+1}-\widehat{z}\|_{{\cal T}}^{2}+\sigma\|B(z^{k+1}-\widehat{z})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}\right.\\[10.0pt] \quad\quad\quad\quad\quad\left.+(1-\min\{\tau,\tau^{-1}\})\sigma\|Ax^{k+1}+Bz^{k+1}-c\|^{2}\right]\\[10.0pt] -\displaystyle\frac{1}{2}\left[(1-\min\{\tau,\tau^{-1}\})\sigma\tau\|B(z^{k+1}-z^{k})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}+\|x^{k+1}-x^{k}\|_{\sigma{\cal S}_{k}}^{2}\right.\\[10.0pt] \quad\quad\quad\left.+\|x^{k+1}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k+1}-\widehat{z}\|_{\Sigma_{g}}^{2}\right.\\[10.0pt] \quad\quad\quad\left.+\left(1-\tau+\min(\tau,\tau^{-1})\right)\sigma\|A(x^{k+1}-x^{k})+B(z^{k+1}-z^{k})\|^{2}\right].\end{array}

\begin{array}[]{l}2[f_{k}(x^{k+1})+g(z^{k+1})]-[f_{k}(\widehat{x})+g(\widehat{z})]\\[6.0pt] \leq\left[(\tau\sigma)^{-1}\|y^{k}\|^{2}+\|x^{k}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k}-\widehat{z}\|_{{\cal T}}^{2}+\sigma\|B(z^{k}-\widehat{z})\|^{2}+\|z^{k}-z^{k-1}\|_{{\cal T}}^{2}\right.\\[6.0pt] \left.+\displaystyle\frac{1}{4}(5-\tau-3\min\{\tau,\tau^{-1}\})\sigma\|Ax^{k}+Bz^{k}-c\|^{2}+\|x^{k}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k}-\widehat{z}\|_{\Sigma_{g}}^{2}\right]\\[10.0pt] -\left[(\tau\sigma)^{-1}\|y^{k+1}\|^{2}+\|x^{k+1}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k+1}-\widehat{z}\|_{{\cal T}}^{2}+\sigma\|B(z^{k+1}-\widehat{z})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}\right.\\[6.0pt] \left.+\displaystyle\frac{1}{4}(5-\tau-3\min\{\tau,\tau^{-1}\})\sigma\|Ax^{k+1}+Bz^{k+1}-c\|^{2}+\|x^{k+1}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k+1}-\widehat{z}\|_{\Sigma_{g}}^{2}\right]\\[10.0pt] -\left[2t_{\tau}\sigma\tau\|B(z^{k+1}-z^{k})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}+\|x^{k+1}-x^{k}\|_{\sigma{\cal S}_{k}}^{2}+\|x^{k+1}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}\right.\\[10.0pt] \quad\quad\quad\left.+\|z^{k+1}-\widehat{z}\|_{\Sigma_{g}}^{2}+\displaystyle\frac{1}{2}\left(1-\tau+\min(\tau,\tau^{-1})\right)\sigma\|Ax^{k+1}+Bz^{k+1}-c\|^{2}\right.\\[10.0pt] \quad\left.+\displaystyle\frac{1}{4}\left(1-\tau+\min(\tau,\tau^{-1})\right)\sigma\left[\|Ax^{k+1}+Bz^{k+1}-c\|^{2}+\|Ax^{k}+Bz^{k}-c\|^{2}\right]\right].\end{array}

\begin{array}[]{l}2[f_{k}(x^{k+1})+g(z^{k+1})]-[f_{k}(\widehat{x})+g(\widehat{z})]\\[6.0pt] \leq\left[(\tau\sigma)^{-1}\|y^{k}\|^{2}+\|x^{k}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k}-\widehat{z}\|_{{\cal T}}^{2}+\sigma\|B(z^{k}-\widehat{z})\|^{2}+\|z^{k}-z^{k-1}\|_{{\cal T}}^{2}\right.\\[6.0pt] \left.+\displaystyle\frac{1}{4}(5-\tau-3\min\{\tau,\tau^{-1}\})\sigma\|Ax^{k}+Bz^{k}-c\|^{2}+\|x^{k}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k}-\widehat{z}\|_{\Sigma_{g}}^{2}\right]\\[10.0pt] -\left[(\tau\sigma)^{-1}\|y^{k+1}\|^{2}+\|x^{k+1}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k+1}-\widehat{z}\|_{{\cal T}}^{2}+\sigma\|B(z^{k+1}-\widehat{z})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}\right.\\[6.0pt] \left.+s_{\tau}\sigma\|Ax^{k+1}+Bz^{k+1}-c\|^{2}+\|x^{k+1}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k+1}-\widehat{z}\|_{\Sigma_{g}}^{2}\right]\\[10.0pt] \end{array}

\begin{array}[]{l}-\left[2t_{\tau}\sigma\tau\|B(z^{k+1}-z^{k})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}+\|x^{k+1}-x^{k}\|_{\sigma{\cal S}_{k}}^{2}+\|x^{k+1}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}\right.\\[10.0pt] \quad\quad\quad\left.+\|z^{k+1}-\widehat{z}\|_{\Sigma_{g}}^{2}+t_{\tau}\sigma\|Ax^{k+1}+Bz^{k+1}-c\|^{2}\right.\\[10.0pt] \quad\left.+\displaystyle\frac{1}{2}t_{\tau}\sigma\left[\|Ax^{k+1}+Bz^{k+1}-c\|^{2}+\|Ax^{k}+Bz^{k}-c\|^{2}\right]\right].\end{array}

Ax^{k+1}+Bz^{k+1}-c=A(x^{k+1}-\widehat{x})+B(z^{k+1}-\widehat{z}),\quad Ax^{k}+Bz^{k}-c=A(x^{k}-\widehat{x})+B(z^{k}-\widehat{z})

\begin{array}[]{l}\|x^{k+1}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|x^{k}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}\geq\displaystyle\frac{1}{2}\|x^{k+1}-x^{k}\|_{\Sigma_{f_{k}}}^{2},\\[6.0pt] \|z^{k+1}-\widehat{z}\|_{\Sigma_{g}}^{2}+\|z^{k}-\widehat{z}\|_{\Sigma_{g}}^{2}\geq\displaystyle\frac{1}{2}\|z^{k+1}-z^{k}\|_{\Sigma_{g}}^{2},\\[6.0pt] \|Ax^{k+1}+Bz^{k+1}-c\|^{2}+\|Ax^{k}+Bz^{k}-c\|^{2}\geq\displaystyle\frac{1}{2}\|A(x^{k+1}-x^{k})+B(z^{k+1}-z^{k})\|^{2},\end{array}

\begin{array}[]{l}2\left\{[f_{k}(x^{k+1})+g(z^{k+1})]-[f_{k}(\widehat{x})+g(\widehat{z})]\right\}\\[6.0pt] \leq\left[(\tau\sigma)^{-1}\|y^{k}\|^{2}+\|x^{k}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k}-\widehat{z}\|_{{\cal T}}^{2}+\sigma\|B(z^{k}-\widehat{z})\|^{2}+\|z^{k}-z^{k-1}\|_{{\cal T}}^{2}\right.\\[6.0pt] \quad\quad\left.+s_{\tau}\sigma\|A(x^{k}-\widehat{x})+B(z^{k}-\widehat{z})\|^{2}+\|x^{k}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k}-\widehat{z}\|_{\Sigma_{g}}^{2}\right]\\[10.0pt] -\left[(\tau\sigma)^{-1}\|y^{k+1}\|^{2}+\|x^{k+1}-\widehat{x}\|_{\sigma{\cal S}_{k}}^{2}+\|z^{k+1}-\widehat{z}\|_{{\cal T}}^{2}\right.\\[10.0pt] \quad\quad\quad\quad\quad\left.+\sigma\|B(z^{k+1}-\widehat{z})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}\right.\\[6.0pt] \quad\quad\left.+s_{\tau}\sigma\|A(x^{k+1}-\widehat{x})+B(z^{k+1}-\widehat{z})\|^{2}+\|x^{k+1}-\widehat{x}\|_{\Sigma_{f_{k}}}^{2}+\|z^{k+1}-\widehat{z}\|_{\Sigma_{g}}^{2}\right]\\[10.0pt] -\left[2t_{\tau}\sigma\tau\|B(z^{k+1}-z^{k})\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}+\|x^{k+1}-x^{k}\|_{\sigma{\cal S}_{k}}^{2}\right.\\[10.0pt] \quad\quad\quad\left.+\displaystyle\frac{1}{2}\|x^{k+1}-x^{k}\|_{\Sigma_{f_{k}}}^{2}+\displaystyle\frac{1}{2}\|z^{k+1}-z^{k}\|_{\Sigma_{g}}^{2}+t_{\tau}\sigma\|Ax^{k+1}+Bz^{k+1}-c\|^{2}\right.\\[10.0pt] \quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\left.+\displaystyle\frac{1}{4}t_{\tau}\sigma\|A(x^{k+1}-x^{k})+B(z^{k+1}-z^{k})\|^{2}\right].\end{array}

\begin{array}[]{l}[f_{k}(x^{k+1})+g(z^{k+1})]-[f_{k}(\widehat{x})+g(\widehat{z})]+\displaystyle\frac{1}{2}\sigma t_{\tau}\|Ax^{k+1}+Bz^{k+1}-c\|^{2}\\[10.0pt] \leq\displaystyle\frac{1}{2}\left[(\tau\sigma)^{-1}\|y^{k}\|^{2}+\|z^{k}-z^{k-1}\|_{{\cal T}}^{2}\right]-\displaystyle\frac{1}{2}\left[(\tau\sigma)^{-1}\|y^{k+1}\|^{2}+\|z^{k+1}-z^{k}\|_{{\cal T}}^{2}\right]\\[10.0pt] \quad\quad+\displaystyle\frac{1}{2}\left[\|(x^{k},z^{k})-(\widehat{x},\widehat{z})\|_{\overline{\cal M}_{k}}^{2}-\|(x^{k+1},z^{k+1})-(\widehat{x},\widehat{z})\|_{\overline{\cal M}_{k}}^{2}\right]\\[10.0pt] \quad\quad-\displaystyle\frac{1}{2}\left[\|(x^{k+1},z^{k+1})-(x^{k},z^{k})\|_{\overline{\cal H}_{k}}^{2}\right].\end{array}

\nu_{N}^{*}=\displaystyle\frac{1}{N}\sum_{k=1}^{N}[f_{k}(\overline{x})+g(\overline{z})].

[f_{k}(x)+g(z)]-[f_{k}(x^{k})+g(z^{k})]\leq\gamma_{0},\quad\forall\,k=1,\dots,N.

\sigma{\cal S}_{k}+\displaystyle\frac{2\tau}{1+8\tau}t_{\tau}\sigma A^{*}A\succeq\displaystyle\frac{2\tau}{1+8\tau}t_{\tau}\sigma I,

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Advanced MIMO Systems Optimization · PAPR reduction in OFDM

Full text

Online Alternating Direction Method of Multipliers for Online Composite Optimization

Supported by National Key R&D Program of China under project No. 2022YFA1004000.

Yule Zhang¹, Zehao Xiao², Jia Wu³ and Liwei Zhang⁴

¹ School of Science, Dalian Maritime University, Dalian 116026, China. ([email protected]) This author was supported by the Natural Science Foundation of China under No. 12201097.

² Institute of Operations Research and Control Theory, School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China. (zehao [email protected])

³ Corresponding author. Institute of Operations Research and Control Theory, School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China. ([email protected]) This author was supported by the Natural Science Foundation of China under No. 12071055.

⁴ Institute of Operations Research and Control Theory, School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China. ([email protected]) This author was supported by the Natural Science Foundation of China under No. 12371298 and partially supported by Dalian High-level Talent Innovation Project No. 2020RD09.

**Abstract.** In this paper, we investigate regrets of an online semi-proximal alternating direction method of multipliers (Online-spADMM) for solving online linearly constrained convex composite optimization problems. Under mild conditions, we establish ${\rm O}(\sqrt{N})$ objective regret and ${\rm O}(\sqrt{N})$ constraint violation regret at round $N$ when the dual step-length is taken in $(0,(1+\sqrt{5})/2)$ and the penalty parameter $\sigma$ is taken as $\sqrt{N}$. We explain that the optimal value of the parameter $\sigma$ is of order ${\rm O}(\sqrt{N})$. Like the semi-proximal alternating direction method of multipliers (spADMM), Online-spADMM has the advantage of efficiently resolving the potential non-solvability of the subproblems. We show the usefulness of the obtained results when applied to different types of online optimization problems and verify the theoretical result by numerical experiments. The inequalities established for Online-spADMM are also used to derive the iteration complexity of the average update of spADMM for solving linearly constrained convex composite optimization problems.

**Key words.** Online semi-proximal alternating direction method of multipliers, objective regret, constraint violation regret, online composite optimization, linear constraints.

**AMS Subject Classifications (2000):** 90C30.

1 Introduction

In online optimization, a decision maker (or an online player) makes decisions iteratively. At each round, the outcomes associated with the decisions are unknown to the decision maker. After committing to a decision, the decision maker suffers a loss. These losses are unknown to the decision maker beforehand.

The Online Convex Optimization (OCO) framework models the feasible set as a convex set $\Phi\subset{\cal U}$ , where ${\cal U}$ is a linear space. The costs are modeled as convex functions over ${\cal U}$ . A learning framework for OCO problems can be described as follows: at round $t$ , the online player chooses $u^{t}\in\Phi$ . After the player has committed to this choice, a convex cost function $\psi_{t}\in\Psi:{\cal U}\rightarrow\overline{\Re}$ is revealed. Here $\Psi$ is the family of cost functions available to the adversary. The cost incurred by the online player is $\psi_{t}(u^{t})$ , the value of the cost function for the choice $u^{t}$ . Let $T$ denote the total number of rounds.

Let ${\cal A}$ be an algorithm for OCO. The regret of ${\cal A}$ after $T$ iterations is defined as:

[TABLE]
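To make the regret notion concrete, the following minimal Python sketch evaluates the regret of one given sequence of decisions against the best fixed decision in hindsight. The setting is an illustrative assumption and is not taken from the paper: scalar quadratic costs $\psi_{t}(u)=(u-a_{t})^{2}$ and an interval $\Phi=[lo,hi]$ , for which the hindsight minimizer is the mean of the $a_{t}$ clipped to the interval. The function name `static_regret` is hypothetical.

```python
def static_regret(decisions, targets, lo, hi):
    """Regret of `decisions` for costs psi_t(u) = (u - a_t)^2 over Phi = [lo, hi]."""
    if len(decisions) != len(targets):
        raise ValueError("one decision per round is required")
    # Cumulative cost actually paid by the online player.
    paid = sum((u - a) ** 2 for u, a in zip(decisions, targets))
    # Best fixed decision in hindsight: the mean of the targets, clipped to Phi
    # (valid for this illustrative quadratic cost only).
    mean = sum(targets) / len(targets)
    best = min(max(mean, lo), hi)
    hindsight = sum((best - a) ** 2 for a in targets)
    return paid - hindsight
```

Playing the hindsight-optimal point in every round gives zero regret, while any other fixed point gives a positive value.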

There are many algorithms for online convex optimization problems under different scenarios. Well-known ones include Follow-the-Leader [14], Follow-the-Regularized-Leader [27], [30], Exponentiated Online Gradient [15], Online Mirror Descent, Perceptron [25] and Winnow [18]. For further algorithms, see Chapter 7 of [21], Chapter 21 of [29], the survey papers of Shalev-Shwartz [28] and Hazan [11], and the references cited in these two papers.

For most works in the literature, as pointed out by [11], there are some restrictions for OCO: the losses determined by an adversary should not be allowed to be unbounded, and the decision set must be somehow bounded and/or structured. We know from [28] and [11] that $\psi_{t}$ is usually not allowed to take infinite values and $\Phi$ has only a simple structure. For example, $\psi_{t}$ is required to be Lipschitz continuous or strongly convex, and/or $\Phi$ is the simplex, the positive orthant, a ball-shaped set, or a box-shaped set.

This paper removes these restrictions by permitting $\psi_{t}$ to take the value $+\infty$ , which allows us to deal with complicated convex constraint sets. We will explain this point after introducing the optimization model considered in this paper.

In this paper, we consider the online composite optimization defined by

[TABLE]

with ${\cal U}={\cal X}\times{\cal Z}$ , $u=(x,z)\in{\cal U}$ , where ${\cal X}$ and ${\cal Z}$ are two finite-dimensional real Hilbert spaces each equipped with an inner product $\langle\cdot,\cdot\rangle$ and its induced norm $\|\cdot\|$ , $f_{t}:{\cal X}\rightarrow(-\infty,+\infty)$ and $g:{\cal Z}\rightarrow(-\infty,+\infty]$ are proper closed convex functions, $A:{\cal X}\rightarrow{\cal Y}$ and $B:{\cal Z}\rightarrow{\cal Y}$ are two linear operators respectively, with ${\cal Y}$ being another finite-dimensional real Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ and its induced norm $\|\cdot\|$ and $c\in{\cal Y}$ . Namely, at round $N$ , the online player is trying to solve

[TABLE]

For online optimization problems with no constraints, or with simply structured constraints embedded in the function $g$ (for example, the probability simplex is embedded in the entropy function), there are a large number of publications in the machine learning field on algorithm design; see for example [6], [35], and the references in [28] and [11]. Besides this literature, there are some recent works related to the online optimization problem (1.2). Mahdavi et al. [22] designed a gradient-based algorithm that achieves ${\rm O}(\sqrt{N})$ regret and ${\rm O}(N^{3/4})$ constraint violations for an online optimization problem whose constraint set is defined by a set of inequalities of smooth convex functions. Recently, Jenatton et al. [13] and Yu and Neely [36] developed new algorithms that improve on the performance of prior works. However, the online optimization model considered in these papers does not cover model (1.2), as $\phi_{t}$ in their problem is required to be smooth and is not permitted to take the value $+\infty$ .

Now we explain that the online composite optimization model (1.2) covers many popular online optimization problems.

Example 1.1.

Consider a general online optimization model in which the cost function at round $t$ is $\phi_{t}:{\cal X}\rightarrow\Re$ and the constraint set is of the form:

[TABLE]

where $N:{\cal X}\rightarrow{\cal V}$ is a linear mapping, $b\in{\cal V}$ , ${\cal K}\subset{\cal V}$ is a closed convex set and ${\cal V}$ is a Hilbert space. Define

[TABLE]

where $\delta$ is the indicator function, ${\cal I}$ is the identity in ${\cal V}$ and ${\cal Z}={\cal V}$ . Then the online optimization problem is expressed as the form (1.2).

Example 1.2.

Consider an online optimization model in which the cost function at round $t$ is $\phi_{t}:{\cal X}\rightarrow\Re$ and the constraint set is a simple convex set $X$ . To avoid jumps between consecutive decisions, the online player introduces a regularizer $R:{\cal X}\rightarrow\Re$ . Define

[TABLE]

where ${\cal I}$ is the identity in ${\cal X}$ . Then the online optimization problem is expressed as the form (1.2).

The off-line problem, in which all losses are known to the decision maker beforehand, corresponds to the case where $f_{k}(x)=f(x)$ , $\forall k=1,\ldots,N$ . In this case, Problem (1.2) or Problem (1.3) is reduced to

[TABLE]

The convex composite optimization problem (1.4) is an important optimization model that arises widely in scientific and engineering fields; see the examples considered in [2]. Alternating direction methods of multipliers (ADMM) for solving Problem (1.4) are an important class of numerical algorithms, which have been extensively studied over the past twenty years. The classic ADMM was designed by Glowinski and Marroco [9] and Gabay and Mercier [8], and its construction was much influenced by Rockafellar’s works on proximal point algorithms for solving the more general maximal monotone inclusion problems [23, 24].

An important advance in the ADMM field is the semi-proximal ADMM (in short, spADMM) proposed by Fazel et al. [7]. This method has several advantages. First, it allows the dual step-length to be at least as large as the golden ratio of $1.618$ . Second, spADMM not only covers the classic ADMM but also resolves the potential non-solvability issue of the subproblems in the classic ADMM. Third, and perhaps most important, it is able to handle multi-block convex optimization problems. For example, it has been shown recently that spADMM is quite efficient in solving multi-block convex composite semidefinite programming problems [34, 16, 4] to low or medium accuracy. Importantly, under the calmness of the inverse Karush-Kuhn-Tucker mapping, spADMM has a linear rate of convergence; this result was established by Han, Sun and Zhang [10].

Inspired by spADMM for Problem (1.4), we construct the following Online-spADMM for the online problem (1.2). For a Hilbert space $\mathcal{Z}$ , for any self-adjoint positive semidefinite linear operator ${\cal B}:\mathcal{Z}\to\mathcal{Z}$ , denote $\|z\|_{\cal B}:=\sqrt{\langle z,{\cal B}z\rangle}$ and

[TABLE]

for any $z\in\mathcal{Z}$ and any set $D\subseteq\mathcal{Z}$ . If ${\cal B}$ is the identity mapping in $\mathcal{Z}$ , namely ${\cal B}={\cal I}$ , we use $\hbox{dist}(z,D)$ to denote the distance of $z$ from $D$ . At round $k\in\textbf{N}$ , for problem

[TABLE]

the augmented Lagrangian function is defined by

[TABLE]

Then Online-spADMM may be described as follows.

Online-spADMM: An online semi-proximal alternating direction method of multipliers for solving the online convex optimization problem (1.2).

**Step 0**

Input $(x^{1},z^{1},y^{1})\in{\cal X}\times{\cal Z}\times\mathcal{Y}.$ Let $\tau\in(0,+\infty)$ be a positive parameter (e.g., $\tau\in(0,(1+\sqrt{5})/2)$ ), and let ${\cal S}_{1}:\mathcal{X}\to\mathcal{X}$ and ${\cal T}:\mathcal{Z}\to\mathcal{Z}$ be self-adjoint positive semidefinite (not necessarily positive definite) linear operators. Set $k:=1$ .

**Step 1**

Set

[TABLE]

**Step 2**

Receive a cost function $f_{k+1}$ and incur loss $f_{k+1}(x^{k+1})+g(z^{k+1})$ and constraint violation $\|Ax^{k+1}+Bz^{k+1}-c\|$ .

**Step 3**

Choose a self-adjoint positive semidefinite linear operator ${\cal S}_{k+1}:\mathcal{X}\to\mathcal{X}$ .

**Step 4**

Set $k:=k+1$ and go to Step 1.
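The loop above can be sketched on a scalar toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Step 1 takes the usual semi-proximal ADMM form (an $x$-update with the proximal term $\frac{\sigma}{2}\|x-x^{k}\|^{2}_{{\cal S}_{k}}$ , a $z$-update with ${\cal T}=0$ , and a dual update with step $\tau\sigma$ ), and that the augmented Lagrangian uses the sign convention $f+g+\langle y,Ax+Bz-c\rangle+\frac{\sigma}{2}\|Ax+Bz-c\|^{2}$ . The costs $f_{k}(x)=\frac{1}{2}(a_{k}x-b_{k})^{2}$ , the regularizer $g(z)=\lambda|z|$ , the constraint $x-z=0$ , and the names `online_spadmm_scalar`, `lam` and `s` are all hypothetical choices made for the example.

```python
import math

def online_spadmm_scalar(data, lam, tau=1.0, s=0.0):
    """Scalar Online-spADMM sketch for f_k(x) = 0.5*(a_k*x - b_k)^2,
    g(z) = lam*|z| and the constraint x - z = 0 (A = 1, B = -1, c = 0)."""
    n_rounds = len(data)
    sigma = math.sqrt(n_rounds)      # penalty parameter sigma = sqrt(N)
    x = z = y = 0.0                  # initial point (x^1, z^1, y^1)
    violations = []
    for a, b in data:                # (a_k, b_k) defines the cost known at round k
        # x-update: minimize f_k(x) + y*(x - z) + sigma/2*(x - z)^2
        #           + sigma*s/2*(x - x_k)^2, a strongly convex quadratic in x.
        x = (a * b - y + sigma * z + sigma * s * x) / (a * a + sigma + sigma * s)
        # z-update (T = 0): soft-thresholding of x + y/sigma at level lam/sigma.
        v = x + y / sigma
        z = math.copysign(max(abs(v) - lam / sigma, 0.0), v)
        # Dual update with step length tau*sigma.
        y = y + tau * sigma * (x - z)
        violations.append(abs(x - z))
    return x, z, y, violations
```

With a constant cost and `lam=0`, the sketch reduces to ADMM on an offline problem and the iterates approach the minimizer; with time-varying `data`, the list `violations` records the per-round constraint violation $|x^{k+1}-z^{k+1}|$ incurred in Step 2.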

To our knowledge, the first online alternating direction method is perhaps the one proposed by Wang and Banerjee [33]. Their algorithm corresponds to a modified version of the above Online-spADMM in which ${\cal T}=0$ and the term $\displaystyle\frac{\sigma}{2}\|x-x^{k}\|^{2}_{{\cal S}_{k}}$ is replaced by $\sigma B_{\phi}(x,x^{k})$ , where $B_{\phi}$ is the Bregman divergence of a smooth convex function $\phi$ . Building on Wang and Banerjee [33], Hosseini et al. [12] extended the online ADMM algorithm to a distributed setting based on dual averaging. Their algorithm is applicable to online convex optimization over a network of agents and attains a sublinear regret bound of ${\rm O}(\sqrt{N})$ for the objective function and the violation of linear local constraints. Liu et al. [20] designed and analyzed a new zeroth-order online algorithm, which not only performs gradient-free operations but also employs ADMM to accommodate complex structured regularizers. Compared with the first-order gradient-based online algorithm, it requires $m$ times more iterations, where $m$ is the number of optimization variables.

For Online-spADMM, we define objective and constraint violation regret by

[TABLE]

and

[TABLE]

respectively.

The main contributions of this paper can be summarized as follows.

•

Cost functions are proper lower semi-continuous convex extended real-valued functions and this makes the optimization model (1.2) include more online optimization problems. For instance, Problem (1.5) includes linear semi-definite programming, quadratic semi-definite programming and convex composite programming.

•

The proposed Online-spADMM allows the dual step-length to be at least as large as the golden ratio of $1.618$ , which is independent of the time horizon and other parameters.

•

When $\sigma=\sqrt{N}$ and ${\cal S}_{k}$ is suitably chosen, under mild assumptions (considerably weaker than those in [33]), it is proved that the objective regret over $N$ iterations is of order ${\rm O}(\sqrt{N})$ , and the constraint violation regret over $N$ iterations is of order ${\rm O}(\sqrt{N})$ . It is also proved that the solution regret is of order ${\rm O}(\sqrt{N})$ under stronger assumptions.

•

It is proved that, for the average of the first $N$ iterations by spADMM for solving linearly constrained convex composite optimization problems, the iteration complexity of objective function is of order ${\rm O}(\sqrt{N})$ and the iteration complexity of constraint violation is of order ${\rm O}(\sqrt{N})$ when $\sigma=\sqrt{N}$ .

•

We apply Online-spADMM to several examples of online linearly constrained optimization problems. The theoretical results are verified by numerical experiments. Compared with the other methods in these experiments, Online-spADMM performs well, especially in running time.

The remaining parts of this paper are organized as follows. In Section 2, we develop two important inequalities, which play a key role in the analysis of objective regret, constraint violation regret and solution error regret. Section 3 establishes bounds on the objective regret, constraint violation regret and solution error regret of Online-spADMM for the online optimization problem. Section 4 is about the complexity of the average iterate of spADMM for solving Problem (1.4) and the recovery of an important inequality in [10]. Section 5 applies Online-spADMM to online quadratic optimization, Lasso and generalized Lasso, and presents experimental results. We make our final conclusions in Section 6.

2 Key Inequalities of Online spADMM

In this section, we demonstrate important inequalities for upper bounds of $[f_{k}(x^{k+1})+g(z^{k+1})]-[f_{k}(\widehat{x})+g(\widehat{z})]$ , where $(\widehat{x},\widehat{z})$ is a feasible point of the set $\Phi$ . These upper bounds play crucial roles in the analysis for constraint violation regret and objective regret of Online spADMM.

Theorem 2.1.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Then, for any $(\widehat{x},\widehat{z})\in\Phi$ , any $k=1,\ldots,N$ ,

[TABLE]

Proof. The proof is quite lengthy. We put it in Appendix A. $\Box$

We define

[TABLE]

Let $\overline{E}:{\cal X}\times{\cal Z}\rightarrow{\cal Y}$ be the linear operator defined by $\overline{E}(x,z)=Ax+Bz$ for $(x,z)\in{\cal X}\times{\cal Z}$ and

[TABLE]

Proposition 2.1.

Let $\tau\in(0,(1+\sqrt{5})/2)$ , then

[TABLE]

Proof. Similar to the proof of Proposition 3 of [10]. $\Box$

Theorem 2.2.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Then, for any $(\widehat{x},\widehat{z})\in\Phi$ , any $k=1,\ldots,N$ ,

[TABLE]

Proof. We have from Theorem 2.1 that

[TABLE]

Rearranging the terms in (2.5), we obtain

[TABLE]

Or equivalently

[TABLE]

Using equalities

[TABLE]

and inequalities

[TABLE]

we obtain from (2.6),

[TABLE]

This is just inequality (2.4). $\Box$

Corollary 2.1.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Then, for any $(\widehat{x},\widehat{z})\in\Phi$ , any $k=1,\ldots,N$ ,

[TABLE]

Proof. The inequality (2.8) follows from (2.4) and the definitions of $\overline{\cal M}_{k}$ and $\overline{\cal H}_{k}$ . $\Box$

3 Regret Analysis

Let $S^{*}_{N}$ denote the solution set of Problem (1.3) and for $(\overline{x},\overline{z})\in S^{*}_{N}$ ,

[TABLE]

In this section, we discuss the iteration complexity of Online-spADMM for solving Problem (1.2). For establishing the constraint violation regret and the objective regret of Online-spADMM in terms of round number $N$ , we make the following assumptions.

Assumption 3.1.

Suppose that the sequence $\{(x^{k},z^{k}):k=1,\ldots,N\}$ generated by Online-spADMM satisfies

[TABLE]

for some $\gamma_{0}>0$ and $(\widehat{x},\widehat{z})\in S^{*}_{N}$ .

Assumption 3.2.

For any $k=1,\ldots,N$ , assume that ${\cal S}_{k}$ satisfies

[TABLE]

where ${\cal I}$ is the identity mapping of ${\cal Y}$ .

Assumption 3.3.

For any $k=1,\ldots,N$ and $g^{k}\in\partial f_{k}(x^{k})$ , suppose that there exists $L>0$ such that

[TABLE]

When $f_{t}$ is Lipschitz continuous with Lipschitz constant $L_{t}$ and

[TABLE]

then Assumption 3.3 is satisfied.

3.1 Constraint and objective regrets

Define

[TABLE]

For this purpose, we give the following lemma.

Lemma 3.1.

For $\overline{\cal H}_{k}$ defined by (2.2) and $(x,z)\in{\cal X}\times{\cal Z}$ , one has

[TABLE]

Proof. We can express $\|(x,z)\|^{2}_{\overline{\cal H}_{k}}$ as

[TABLE]

In view of the inequality

[TABLE]

we have from (3.5) that

[TABLE]

The proof is completed. $\Box$

Making a summation of (2.8) for $k=1,\ldots,N$ , we obtain

[TABLE]

We first give a proposition about a bound for the constraint violation regret by Online-spADMM.

Proposition 3.1.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Let the following matrix orders hold:

[TABLE]

Then

[TABLE]

Proof. For any $(\widehat{x},\widehat{z})\in S^{*}_{N}$ , we have

[TABLE]

Since ${\cal S}_{1}\succeq{\cal S}_{2}\succeq\cdots\succeq{\cal S}_{N}$ , we have

[TABLE]

Then, for $\{w^{1},w^{2},\ldots,w^{N+1}\}\subset{\cal X}\times{\cal Z}$ , one has

[TABLE]

Thus we obtain

[TABLE]

Therefore we have from (3.6) that

[TABLE]

Therefore, inequality (3.8) follows directly from inequality (3.6) and Assumption 3.1. $\Box$
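The role of the ordering ${\cal S}_{1}\succeq\cdots\succeq{\cal S}_{N}$ in the proof can be seen from a generic telescoping argument. The following worked derivation is a sketch under an assumption: it uses hypothetical self-adjoint operators ${\cal Q}_{1}\succeq{\cal Q}_{2}\succeq\cdots\succeq{\cal Q}_{N}\succeq 0$ on ${\cal X}\times{\cal Z}$ as stand-ins for the weighting operators of the proof, and it is not claimed to reproduce the elided displays term by term.

```latex
\begin{aligned}
\sum_{k=1}^{N}\Big(\|w^{k}-\widehat{w}\|^{2}_{{\cal Q}_{k}}-\|w^{k+1}-\widehat{w}\|^{2}_{{\cal Q}_{k}}\Big)
&=\|w^{1}-\widehat{w}\|^{2}_{{\cal Q}_{1}}-\|w^{N+1}-\widehat{w}\|^{2}_{{\cal Q}_{N}}
+\sum_{k=2}^{N}\Big(\|w^{k}-\widehat{w}\|^{2}_{{\cal Q}_{k}}-\|w^{k}-\widehat{w}\|^{2}_{{\cal Q}_{k-1}}\Big)\\
&\leq\|w^{1}-\widehat{w}\|^{2}_{{\cal Q}_{1}}.
\end{aligned}
```

Each term of the last sum is nonpositive because ${\cal Q}_{k}\preceq{\cal Q}_{k-1}$ , and the term $-\|w^{N+1}-\widehat{w}\|^{2}_{{\cal Q}_{N}}$ is dropped because ${\cal Q}_{N}\succeq 0$ . Without the monotone ordering, the middle sum could grow with $N$ .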

We now give the following proposition about a bound for the sum of objective regret and constraint violation regret by Online-spADMM.

Proposition 3.2.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Let Assumption 3.1 be satisfied and the following matrix orders hold:

[TABLE]

Then the following property holds:

[TABLE]

for $g^{k}\in\partial f_{k}(x^{k}),k=1,\ldots,N$ .

Proof. For $k=1,\ldots,N$ , we have for any $g^{k}\in\partial f_{k}(x^{k})$ ,

[TABLE]

It follows from Lemma 3.1 that

[TABLE]

We have from (3.6),(3.11) and (3.12) that

[TABLE]

Since ${\cal S}_{1}\succeq{\cal S}_{2}\succeq\cdots\succeq{\cal S}_{N}$ , we have $\overline{\cal M}_{1}\succeq\cdots\succeq\overline{\cal M}_{N}$ and from (3.13) we obtain

[TABLE]

The proof is completed. $\Box$

Now we are in a position to state the main theorem about the bounds of the constraint violation regret and the objective regret by Online-spADMM, respectively.

Theorem 3.1.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Suppose that the following matrix orders hold:

[TABLE]

Then the following properties hold:

(i)

Let Assumptions 3.2 and 3.3 be satisfied, then

[TABLE]

(ii)

Let Assumptions 3.1, 3.2 and 3.3 be satisfied, then

[TABLE]

Proof. From (3.1) in Assumption 3.2, we know that

[TABLE]

and Conclusion (i) follows from Proposition 3.2 and Assumption 3.3. In view of (3.18), Assumption 3.1 and Assumption 3.3, we obtain (ii) from Proposition 3.2. $\Box$

Define

[TABLE]

We obtain the following result from Theorem 3.1, which provides upper bounds in terms of $N$ for objective regret and constraint violation regret.

Corollary 3.1.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Suppose that the following matrix orders hold:

[TABLE]

Let $\sigma=\sqrt{N}$ . Then the following properties hold:

(i)

Let Assumptions 3.2 and 3.3 be satisfied, then

[TABLE]

(ii)

Let Assumptions 3.1, 3.2 and 3.3 be satisfied, then

[TABLE]

Proof. From the definitions of $\overline{\cal M}_{1}$ , $\kappa_{1}$ and $\kappa_{2}$ , one has that

[TABLE]

Thus we obtain conclusion (i) from (i) of Theorem 3.1 and the definition of $\kappa_{3}$ and $\kappa_{4}$ . Conclusion (ii) follows from (ii) of Theorem 3.1. $\Box$

Remark 3.1.

We have the following observations:

a)

It follows from Corollary 3.1 that the objective regret and the constraint violation regret are of order $\sqrt{N}$ whose coefficients are dependent on the distance of the initial point $(x^{1},z^{1})$ from $S^{*}_{N}$ if $g$ is nonnegative-valued.

b)

Since $g$ is often a regularizer, the assumption that $g$ is nonnegative-valued is quite natural. Assumption 3.3 is also natural and was adopted in [28].

c)

Assumption 3.1 is satisfied in many circumstances. For example if $g$ has the following form

[TABLE]

where $Z\subset{\cal Z}$ is a nonempty convex compact set, $\theta:Z\rightarrow\Re$ is a continuous function and $A^{*}$ is an onto linear operator, then Assumption 3.1 is satisfied.

d)

Since we cannot neglect the term

[TABLE]

which is of order $\sigma$ , in (3.16) and (3.17) for objective regret and constraint violation regret, the optimal choice of $\sigma$ is of order ${\rm O}(\sqrt{N})$ .
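A short calculation illustrates why the order $\sqrt{N}$ is a natural choice in observation d). Assume, purely for illustration, that a regret bound has the form $\beta(\sigma)=a\sigma+bN/\sigma$ with constants $a,b>0$ independent of $\sigma$ and $N$ . The function $\beta$ and the constants $a$ and $b$ are hypothetical stand-ins for the terms in (3.16) and (3.17), not quantities defined in the paper.

```latex
\beta(\sigma)=a\,\sigma+\frac{b\,N}{\sigma},\qquad
\beta'(\sigma)=a-\frac{b\,N}{\sigma^{2}}=0
\;\Longrightarrow\;
\sigma^{*}=\sqrt{\frac{b\,N}{a}},\qquad
\beta(\sigma^{*})=2\sqrt{a\,b\,N}.
```

Under this assumed form, the simpler choice $\sigma=\sqrt{N}$ gives $\beta(\sqrt{N})=(a+b)\sqrt{N}$ , which has the same order in $N$ as the minimum and differs only in the constant.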

3.2 Solution regret

In this subsection, we discuss the possibility for deriving solution regret. For this purpose, we define

[TABLE]

and derive new inequalities from (2.1). From (2.1), we have

[TABLE]

Then inequality (3.22) can be equivalently written as

[TABLE]

As in Proposition 3.1, based on (3.23), we may prove a bound for both the constraint violation regret and the solution regret of Online-spADMM.

Proposition 3.3.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Let the following matrix orders hold:

[TABLE]

Then

[TABLE]

Like Proposition 3.2, we may prove another bound of the objective regret by Online-spADMM.

Proposition 3.4.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Let Assumption 3.1 be satisfied and the following matrix orders hold:

[TABLE]

Then the following property holds:

[TABLE]

for $g^{k}\in\partial f_{k}(x^{k}),k=1,\ldots,N$ .

We obtain the following result from Proposition 3.4.

Theorem 3.2.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Suppose that the following matrix orders hold:

[TABLE]

where ${\cal S}$ is a positive definite self-adjoint operator. Then the following properties hold:

(i)

Let Assumption 3.3 be satisfied, then

[TABLE]

(ii)

Let Assumptions 3.1 and 3.3 be satisfied, then

[TABLE]

(iii)

Let Assumption 3.3 be satisfied and

[TABLE]

then the solution regret has the following bound

[TABLE]

Define

[TABLE]

We obtain the following result from Theorem 3.2, which provides upper bounds in terms of $N$ for objective regret, constraint violation regret and solution regret.

Corollary 3.2.

Let $\{(x^{k},z^{k},y^{k}):k=1,\ldots,N+1\}$ be generated by Online-spADMM. Suppose that the following matrix orders hold:

[TABLE]

where ${\cal S}$ is a positive definite self-adjoint operator. Then the following properties hold:

(i)

Let Assumption 3.3 be satisfied, then

[TABLE]

(ii)

Let Assumptions 3.1 and 3.3 be satisfied, then

[TABLE]

(iii)

Let Assumption 3.3 be satisfied and

[TABLE]

then the solution regret has the following bound

[TABLE]

We should point out that Assumption 3.2 is weaker than condition (3.33). However, if ${\cal W}_{k}$ satisfies

[TABLE]

for some self-adjoint operator ${\cal W}$ , then under condition (3.36), (3.37) provides a bound on the solution regret. Whether condition (3.36) always holds remains an open problem.

4 Averaging in spADMM

In this section, we consider the off-line problem, namely the case where $f_{k}(x)=f(x)$ , $\forall k=1,\ldots,N$ . In this case, Problem (1.2) is reduced to

[TABLE]

The augmented Lagrangian function of problem (4.1) is defined by

[TABLE]

For this problem, Online-spADMM reduces to the semi-proximal alternating direction method of multipliers, namely the spADMM proposed in [7].

spADMM: A semi-proximal alternating direction method of multipliers for solving the convex optimization problem (4.1).

**Step 0**

Input $(x^{1},z^{1},y^{1})\in\hbox{dom}\;f\times\hbox{dom}\;g\times\mathcal{Y}$ . Let $\tau\in(0,+\infty)$ be a positive parameter (e.g., $\tau\in(0,(1+\sqrt{5})/2)$ ), and let ${\cal S}_{1}:\mathcal{X}\to\mathcal{X}$ and ${\cal T}:\mathcal{Z}\to\mathcal{Z}$ be self-adjoint positive semidefinite (not necessarily positive definite) linear operators. Set $k:=1$ .

**Step 1**

Set

[TABLE]

**Step 2**

Set $k:=k+1$ and go to Step 1.

There is a slight difference between Online-spADMM and the above spADMM proposed in [7]. In Online-spADMM, instead of using ${\cal S}_{k}$ in Step 1, we use $\sigma{\cal S}_{k}$ in this step, because this is more convenient for regret bound analysis.

In this section, we first discuss regrets of spADMM iterations and the iteration complexity of the averaging of generated iterative points. After that we will recover an important inequality in [10], which is the key for establishing the linear rate of convergence for spADMM under calmness condition of the inverse KKT mapping.

4.1 Regrets of spADMM Iterations

Let $S^{*}$ and $\nu^{*}$ denote the solution set and the optimal value of Problem (4.1), respectively. We introduce the following notations:

[TABLE]

where $\overline{E}$ is a linear operator defined by $\overline{E}(x,z)=Ax+Bz$ .

In order to derive the constraint violation regret of spADMM for solving Problem (4.1), we need the following assumption similar to Assumption 3.1.

Assumption 4.1.

Suppose that the sequence $\{(x^{k},z^{k})\}$ generated by spADMM satisfies

[TABLE]

for some $\gamma_{0}>0$ and $(\widehat{x},\widehat{z})\in S^{*}$ .

From Corollary 2.1, we obtain the following result directly.

Proposition 4.1.

Let $\{(x^{k},z^{k},y^{k})\}$ be generated by spADMM. Then, for any $(\widehat{x},\widehat{z})\in\Phi$ , any $k=1,\ldots,$

[TABLE]

From (4.5), we obtain the following inequality:

[TABLE]

Define

[TABLE]

Then, using (4.7) and the definition of $\overline{\cal M}$ , we obtain the following result about regrets of spADMM.

Theorem 4.1.

Let $N$ be a positive integer. Let $\{(x^{k},z^{k},y^{k})\}$ be generated by spADMM with $\sigma=\sqrt{N}$ . Suppose that Assumption 4.1 is satisfied. Then the following properties hold.

(i)

The objective regret satisfies the following bound:

[TABLE]

(ii)

The constraint violation regret has the following bound:

[TABLE]

Define for $t>1$ ,

[TABLE]

Then we can easily obtain the following estimates from Theorem 4.1.

Theorem 4.2.

Let $N$ be a positive integer. Let $\{(x^{k},z^{k},y^{k})\}$ be generated by spADMM with $\sigma=\sqrt{N}$ . Suppose that Assumption 4.1 is satisfied. Then the following properties hold.

(i)

The error in objective at $(\widehat{x}^{N},\widehat{z}^{N})$ satisfies

[TABLE]

(ii)

The error in constraint at $(\widehat{x}^{N},\widehat{z}^{N})$ satisfies

[TABLE]
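Theorem 4.2 concerns averaged iterates. The following minimal Python sketch computes running (ergodic) averages of a sequence of iterates, assuming the elided definition is a plain arithmetic mean of the points generated so far; the exact index range used in the paper is not recoverable here, and the function name `ergodic_averages` is hypothetical. By convexity (Jensen's inequality), the objective value at the average is at most the average of the objective values, so a bound of order $\sqrt{N}$ on a sum over $N$ rounds becomes, after division by $N$ , a bound of order $1/\sqrt{N}$ at the averaged point.

```python
def ergodic_averages(points):
    """Return the running averages of a list of equal-length vectors."""
    averages = []
    mean = None
    for count, point in enumerate(points, start=1):
        if mean is None:
            mean = [float(p) for p in point]
        else:
            # Incremental mean: m_t = m_{t-1} + (p_t - m_{t-1}) / t.
            mean = [m + (p - m) / count for m, p in zip(mean, point)]
        averages.append(list(mean))
    return averages
```

The incremental form avoids storing partial sums and lets the average be updated once per iteration of spADMM.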

4.2 Recovery of an important inequality in [10]

Let $(\overline{x},\overline{z})\in S^{*}$ be a solution to Problem (4.1) and $\overline{y}\in{\cal Y}$ be a vector such that $(\overline{x},\overline{z},\overline{y})$ satisfies the following Karush-Kuhn-Tucker system

[TABLE]

From equalities

[TABLE]

and

[TABLE]

we obtain

[TABLE]

Noting from (4.12) that $(\overline{x},\overline{z})\in\Phi$ , we have from Corollary 2.1, for $\{(x^{k},z^{k},y^{k})\}$ generated by spADMM, that

[TABLE]

In terms of (4.12), we know that $-A^{*}\overline{y}\in\partial f(\overline{x})$ and $-B^{*}\overline{y}\in\partial g(\overline{z})$ . Then we obtain from the convexity of $f$ and $g$ that

[TABLE]

From this and (4.14), we get

[TABLE]

or

[TABLE]

We introduce the following linear operators, the same notations as in [10],

[TABLE]

where ${\cal I}$ is the identity operator in ${\cal Y}$ . Then inequality (4.16) is equivalent to

[TABLE]

which coincides with the important formula (26) in [10]. Thus inequality (4.14) in Corollary 2.1 is not so strong as formula (26) in [10].

5 Examples and Numerical Evaluations

In this section, numerical results for the proposed algorithm are presented. We apply Online-spADMM to several specific problems and conduct numerical experiments on synthetic data to validate the theoretical performance of our algorithm. We report the time-averaged objective regret and the time-averaged constraint violation of our algorithm and compare the performance of Online-spADMM with several well-studied algorithms. The test cases follow those provided by [2].

In this part, we evaluate the performance of Online-spADMM for solving online quadratic optimization, Lasso and total variation (TV), respectively. All computational results are obtained by running Matlab R2020a on Windows 10 (Intel Core i5-10400 CPU @ 2.90GHz 16GB RAM).

5.1 Application to online quadratic optimization

Consider the online quadratic optimization (OQO) problem with

[TABLE]

where

[TABLE]

and $X\subset\Re^{n}$ is a closed convex compact set, $G_{t}\in\Re^{n\times n}$ , $c_{t}\in\Re^{n}$ , $A\in\Re^{m\times n}$ and $b\in\Re^{m}$ .

We test the following numerical case provided on Boyd’s website (http://www.stanford.edu/~boyd/papers/admm/quadprog/quadprog_example.html). For better numerical results, at time $t$ we generate a well-conditioned symmetric positive definite matrix $G_{t}\in\mathbb{S}^{n\times n}$ and a vector $c_{t}\in\Re^{n}$ , where $G_{t}$ is generated from the uniform distribution over $(0,1)$ and $c_{t}$ from the standard normal distribution.

Define

[TABLE]

and

[TABLE]

We can reformulate the OQO problem into the framework of Problem (1.2), i.e.,

[TABLE]

The augmented Lagrangian of Problem (5.3) is defined as

[TABLE]

Then Online-spADMM may be described as follows.

Online-spADMM for online convex quadratic optimization problem (5.3).

**Step 0**

Input $(x^{1},z^{1},\mu^{1},\lambda^{1})\in\Re^{n}\times\Re^{n}\times\Re^{m}\times\Re^{n}.$ Let $\tau\in(0,+\infty)$ be a positive parameter (e.g., $\tau\in(0,(1+\sqrt{5})/2)$ ), $S_{1}\in\mathbb{S}^{n}_{+}$ be a symmetric positive semidefinite matrix. Set $k:=1$ .

**Step 1**

Set

[TABLE]

**Step 2**

Receive a cost function $f_{k+1}$ and incur loss $f_{k+1}(x^{k+1})$ and constraint violation $\|Ax^{k+1}-b\|$ .

**Step 3**

Choose a symmetric positive semidefinite matrix $S_{k+1}\in\mathbb{S}^{n}_{+}$ .

**Step 4**

Set $k:=k+1$ and go to Step 1.

For integer $N$ , choose $\alpha>0$ large enough such that

[TABLE]

Define

[TABLE]

Let $\sigma=\sqrt{N}$ . Then subproblems for $x^{k+1}$ and $z^{k+1}$ in (5.4) have the following explicit solutions:

[TABLE]

Therefore, when $X$ is a simple convex set and $\Pi_{X}$ is easy to calculate, Online-spADMM with $S_{t}$ defined by (5.16) is quite effective, as the subproblems in (5.4) have explicit solutions.
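For two common simple sets, the projection $\Pi_{X}$ has a closed form. The following Python sketch is illustrative only: the paper does not state which set $X$ is used in the experiments, and the function names `project_box` and `project_ball` are hypothetical.

```python
import math

def project_box(x, lo, hi):
    """Euclidean projection of x onto the box {u : lo <= u_i <= hi}."""
    return [min(max(v, lo), hi) for v in x]

def project_ball(x, radius):
    """Euclidean projection of x onto the ball {u : ||u||_2 <= radius}."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)          # already inside the ball
    return [v * radius / norm for v in x]
```

Both projections cost a single pass over the vector, so an update that ends with such a projection stays cheap even for large $n$ .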

In our numerical tests, we choose $n=[10,20,50,100]$ , total number of iterations $T=5000$ , $\tau=[0.1,0.3,1.618]$ and $\sigma=\sqrt{N}$ . In the existing literature, the dimensions of numerical examples are generally very low, usually 2 or 5; for instance, see [3], [12] and [19].

In this experiment, we compare the performance of Online-spADMM with the Online Alternating Direction Method (OADM) algorithm proposed in [33], which takes the following form:

[TABLE]

We set the parameters to $\eta_{1}=\sqrt{N}$ and $\eta_{2}=T$ .

The results are shown in Figure 1, which displays the numerical results of the OADM and Online-spADMM algorithms for different choices of the parameter $\tau$ . We can see that Online-spADMM achieves $\mathcal{O}(\sqrt{N})$ objective regret and $\mathcal{O}(\sqrt{N})$ constraint violation regret at round $N$ , which is consistent with our theoretical results. The Online-spADMM algorithm performs slightly better than the OADM algorithm. Combining Figure 1 and Table 1, we find that the smaller $\tau$ is, the smaller the time-averaged objective regret, whereas the time-averaged constraint violation shows the opposite trend. Table 2 compares the running times of Online-spADMM and OADM for all cases; Online-spADMM has the shorter running time.

5.2 Lasso

In this subsection, we study a numerical example of Lasso. Lasso [32] is an important special case of $l_{1}$ regularized linear regression. For a given scalar $\lambda\geq 0$ , the Lasso regularizer is chosen as $\varphi(x)=\lambda\|x\|_{1}$ . This involves solving

[TABLE]

where $A_{t}\in\Re^{m\times n}$ and $b_{t}\in\Re^{m}$ . By introducing an auxiliary variable $z\in\Re^{n}$ , we can reformulate the Lasso problem (5.7) into the framework of Problem (1.2), i.e.,

[TABLE]

where $A_{t}$ and $b_{t}$ are generated by the standard normal distribution. The specific generating code can be found on Boyd’s website (http://www.stanford.edu/~boyd/papers/admm/lasso/lasso_example.html).

The augmented Lagrangian of Problem (5.8) is defined as

[TABLE]

Then Online-spADMM may be described as follows.

Online-spADMM for Lasso (5.7).

**Step 0**

Input $(x^{1},z^{1},\mu^{1},\lambda^{1})\in\Re^{n}\times\Re^{n}\times\Re^{m}\times\Re^{n}.$ Let $\tau\in(0,+\infty)$ be a positive parameter (e.g., $\tau\in(0,(1+\sqrt{5})/2)$ ), $S_{1}\in\mathbb{S}^{n}_{+}$ be a symmetric positive semidefinite matrix. Set $k:=1$ .

**Step 1**

Set

[TABLE]

**Step 2**

Receive a cost function $f_{k+1}$ and incur loss $f_{k+1}(x^{k+1})$ and constraint violation $\|x^{k+1}-z^{k+1}\|$ .

**Step 3**

Choose a symmetric positive semidefinite matrix $S_{k+1}\in\mathbb{S}^{n}_{+}$ .

**Step 4**

Set $k:=k+1$ and go to Step 1.

For integer $N$ , choose $\alpha>0$ large enough such that

[TABLE]

Define

[TABLE]

Let $\sigma=\sqrt{N}$ . Then subproblems for $x^{k+1}$ and $z^{k+1}$ in (5.9) have the following explicit solutions:

[TABLE]

In our numerical tests, we choose $n=[10,20,50]$ , total number of iterations $T=5000$ , $\tau=[0.1,0.3,1.618]$ and $\sigma=\sqrt{N}$ . We intend to compare the performance of Online-spADMM with the following well-studied algorithms:

•

OADM. The Online Alternating Direction Method (OADM) algorithm proposed in [33] is in the following form:

[TABLE]

We set the parameters to $\eta_{1}=\sqrt{N}$ and $\eta_{2}=T/2$ .

$\bullet$ FOBOS. The Online Forward-Backward Splitting Method (FOBOS) algorithm proposed in [6] is in the following form:

[TABLE]

where parameters $\rho_{k}\propto 1/k$ .

$\bullet$ RDA. The Regularized Dual Averaging Method (RDA) algorithm proposed in [35] is in the following form:

[TABLE]

We set the parameters to $\eta=0.005$ and $\beta_{k}=\gamma\sqrt{k}$ , where $\gamma=5000$ .
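For orientation, the two unconstrained baselines can be sketched for the $\ell_{1}$ regularizer as follows. FOBOS takes a gradient step and then applies soft-thresholding; the basic $\ell_{1}$-RDA update thresholds the running average of gradients. The sketch assumes the smooth loss $\frac{1}{2}\|A_{k}x-b_{k}\|^{2}$ and the auxiliary function $h(x)=\frac{1}{2}\|x\|^{2}$ for RDA. The constant `c` in $\rho_{k}=c/k$ is hypothetical, and the parameter $\eta$ quoted above is not modeled here.

```python
def soft_threshold(v, kappa):
    """Componentwise soft-thresholding: the prox of kappa * ||.||_1."""
    return [(1.0 if vi > 0 else -1.0) * max(abs(vi) - kappa, 0.0) for vi in v]


def lasso_gradient(A, b, x):
    """Gradient A^T (A x - b) of the assumed loss 0.5 * ||A x - b||^2."""
    r = [sum(a * xi for a, xi in zip(row, x)) - bi for row, bi in zip(A, b)]
    return [sum(A[i][j] * r[i] for i in range(len(A)))
            for j in range(len(x))]


def fobos_step(x, grad, lam, k, c=1.0):
    """FOBOS: forward gradient step, then prox step on lam * ||.||_1."""
    rho = c / k  # rho_k proportional to 1/k; the constant c is hypothetical
    half = [xi - rho * gi for xi, gi in zip(x, grad)]
    return soft_threshold(half, rho * lam)


def rda_step(gbar, grad, lam, k, gamma):
    """Basic l1-RDA with h(x) = 0.5 * ||x||^2 and beta_k = gamma * sqrt(k).

    Returns the next iterate and the updated average gradient.
    """
    gbar = [((k - 1) * gi + ni) / k for gi, ni in zip(gbar, grad)]
    scale = (k ** 0.5) / gamma  # equals k / beta_k
    x = [-scale * si for si in soft_threshold(gbar, lam)]
    return x, gbar
```

Both updates are closed-form only because the regularizer acts directly on $x$, which is the property that fails for the generalized Lasso.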

The results are shown in Figure 2. Since changing $\tau$ has little effect on the time-averaged objective regret, we do not consider the effect of $\tau$ on this metric. The plots show the numerical results of OADM and of Online-spADMM for the different choices of $\tau$ . Combining Figure 2 and Table 3, we see that Lasso and OQO exhibit similar behavior in terms of the time-averaged objective regret. Table 3 shows that, across the tested dimensions, Online-spADMM always performs better than the other three algorithms. Since FOBOS and RDA are designed for unconstrained online optimization problems, the time-averaged constraint violation is not a performance metric for them, so for this metric we only compare Online-spADMM and OADM. The smaller $\tau$ is, the larger the constraint violation, and the performance of Online-spADMM is not much worse than that of OADM. From Table 4, we observe that Online-spADMM has a shorter running time.

5.3 Generalized Lasso

Online-spADMM is more powerful for problems with complicated objective functions. For problems such as Problem (5.3), FOBOS and RDA are no longer applicable, since their subproblems admit no closed-form solution. Another important problem, the generalized Lasso, takes the form

[TABLE]

where $F$ is an arbitrary linear transformation. An important special case is when $F\in\Re^{(n-1)\times n}$ is the difference matrix,

[TABLE]
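As a small illustration, the first-order difference matrix and the resulting penalty $\|Fx\|_{1}$ can be built as below. The sign convention (row $i$ has $-1$ in column $i$ and $+1$ in column $i+1$) is an assumption of this sketch; the opposite convention gives the same value of $\|Fx\|_{1}$. The function names are hypothetical.

```python
def difference_matrix(n):
    """Return the (n-1)-by-n first-order difference matrix as a list of rows."""
    F = [[0.0] * n for _ in range(n - 1)]
    for i in range(n - 1):
        F[i][i] = -1.0      # assumed sign convention
        F[i][i + 1] = 1.0
    return F


def apply_matrix(F, x):
    """Compute F x for a list-of-rows matrix F."""
    return [sum(f * xi for f, xi in zip(row, x)) for row in F]


def total_variation(x):
    """Compute ||F x||_1 directly, without forming F."""
    return sum(abs(x[i + 1] - x[i]) for i in range(len(x) - 1))
```

For a piecewise-constant signal most entries of $Fx$ vanish, which is why penalizing $\|Fx\|_{1}$ favors such signals.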

When $A_{t}=I$ , generalized lasso can be expressed as

[TABLE]

This is the total variation (TV) model for removing noise from images [26].

By introducing an auxiliary variable $z\in\Re^{n}$ , we can reformulate the generalized Lasso problem (5.13) into the framework of Problem (1.2), i.e.,

[TABLE]

where $b_{t}$ is drawn from the standard normal distribution. The specific generating code can be found on Boyd's website (http://www.stanford.edu/~boyd/papers/admm/total_variation/total_variation_example.html), although we modified the code to accommodate our setup.

The augmented Lagrangian of Problem (5.14) is defined as

[TABLE]

Then Online-spADMM may be described as follows.

Online-spADMM for TV problem (5.13).

**Step 0**

Input $(x^{1},z^{1},\mu^{1},\lambda^{1})\in\Re^{n}\times\Re^{n}\times\Re^{m}\times\Re^{n}.$ Let $\tau\in(0,+\infty)$ be a positive parameter (e.g., $\tau\in(0,(1+\sqrt{5})/2)$ ), $S_{1}\in\mathbb{S}^{n}_{+}$ be a symmetric positive semidefinite matrix. Set $k:=1$ .

**Step 1**

Set

[TABLE]

**Step 2**

Receive a cost function $f_{k+1}$ and incur loss $f_{k+1}(x^{k+1})$ and constraint violation $\|Fx^{k+1}-z^{k+1}\|$ .

**Step 3**

Choose a symmetric positive semidefinite matrix $S_{k+1}\in\mathbb{S}^{n}_{+}$ .

**Step 4**

Set $k:=k+1$ and go to Step 1.

For integer $N$ , choose $\alpha>0$ large enough such that

[TABLE]

Define

[TABLE]

Let $\sigma=a\sqrt{N}$ , where $a$ is a scalar. Then subproblems for $x^{k+1}$ and $z^{k+1}$ in (5.15) have the following explicit solutions:

[TABLE]
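A minimal sketch of one round for the TV splitting $Fx-z=0$ follows, assuming $f_{k}(x)=\frac{1}{2}\|x-b_{k}\|^{2}$, the proximal matrix $S_{k}=\alpha I-\sigma F^{T}F$, and the augmented Lagrangian $f_{k}(x)+\lambda\|z\|_{1}+\langle y,Fx-z\rangle+\frac{\sigma}{2}\|Fx-z\|^{2}$. Because $\lambda_{\max}(F^{T}F)<4$ for the first-order difference matrix, the default $\alpha=4\sigma$ keeps $S_{k}$ positive semidefinite. The multiplier name `y`, its sign convention, and the function names are choices made for this sketch and may differ from the formulas in (5.15).

```python
def soft_threshold(v, kappa):
    """Componentwise soft-thresholding: the prox of kappa * ||.||_1."""
    return [(1.0 if vi > 0 else -1.0) * max(abs(vi) - kappa, 0.0) for vi in v]


def diff(x):
    """Apply F: first-order differences x[i+1] - x[i]."""
    return [x[i + 1] - x[i] for i in range(len(x) - 1)]


def diff_t(w):
    """Apply F^T to a vector w of length n-1, returning length n."""
    n = len(w) + 1
    return [(w[j - 1] if j >= 1 else 0.0) - (w[j] if j < n - 1 else 0.0)
            for j in range(n)]


def spadmm_tv_step(b, x, z, y, lam, sigma, tau, alpha=None):
    """One semi-proximal ADMM round with S = alpha*I - sigma*F^T F."""
    if alpha is None:
        alpha = 4.0 * sigma  # upper bound on sigma * lambda_max(F^T F)
    # x-update: the F^T F terms cancel, leaving a diagonal linear system.
    w = [yi + sigma * (fi - zi) for yi, fi, zi in zip(y, diff(x), z)]
    x_new = [(bi + alpha * xi - ti) / (1.0 + alpha)
             for bi, xi, ti in zip(b, x, diff_t(w))]
    # z-update: prox of (lam / sigma) * ||.||_1 at F x_new + y / sigma.
    fx = diff(x_new)
    z_new = soft_threshold([fi + yi / sigma for fi, yi in zip(fx, y)],
                           lam / sigma)
    # Dual update with step length tau * sigma.
    y_new = [yi + tau * sigma * (fi - zi) for yi, fi, zi in zip(y, fx, z_new)]
    return x_new, z_new, y_new
```

Applying $F$ and $F^{T}$ as difference operators avoids storing the matrix and keeps each round linear in $n$.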

In our numerical tests, we choose $n\in\{10,20,50,100\}$ , total number of iterations $T=5000$ , $\tau\in\{0.1,0.3,1.618\}$ and $\sigma=\sqrt{N}$ . We compare the performance of Online-spADMM with the Online Alternating Direction Method (OADM) algorithm proposed in [33], which takes the following form:

[TABLE]

We set the parameters to $\eta_{1}=\sqrt{N}$ and $\eta_{2}=T/2$ .

The results are shown in Figure 3. Since changing $\tau$ has little effect on the performance of Online-spADMM, we instead consider the effect of $\sigma$ on this algorithm. Combining Figure 3 and Table 5, we see that Lasso and OQO exhibit similar behavior in terms of the time-averaged objective regret and constraint violation. The larger $a$ is, the larger the time-averaged objective regret, while the time-averaged constraint violation shows the opposite trend. From Table 6, we observe that Online-spADMM has a shorter running time.

6 Conclusions

In this paper, we have established regret bounds for the objective and the constraint violation of Online-spADMM for solving online linearly constrained convex composite optimization problems, and presented numerical experiments to verify our theoretical results. One significant feature of our approach is that the bounds for objective and constraint violation are obtained under weak assumptions on the objective functions. As the bound for solution regret in Theorem 3.2 is not satisfactory, an important open question is to find sufficient conditions ensuring an ${\rm O}(\sqrt{N})$ regret bound for solution errors. For spADMM, whether an ${\rm O}(\sqrt{N})$ constraint violation regret bound can be obtained when $\sigma$ is a fixed constant is another topic worth studying. This paper only discusses the case of a constant constraint set $\Phi=\{(x,z):Ax+Bz=c\}$ ; a difficult problem left for future study is the case where $A,B,c$ change with time $t$ , as in the online linear optimization considered in [1], or the even more complicated case where the constraints are time-varying inequalities, as considered in [37]. These online optimization models are worth studying as they cover a large number of important practical problems.

Appendix A. Proof of Theorem 2.1. Since

[TABLE]

we have, from the optimality for convex programming, that

[TABLE]

which, combined with $y^{k+1}=y^{k}+\tau\sigma(Ax^{k+1}+Bz^{k+1}-c)$ , implies

[TABLE]

It follows from the convexity of $f_{k}$ that

[TABLE]

From the convexity of $g$ , we have

[TABLE]

Adding both sides of (6.2) and (6.3), we obtain

[TABLE]

From this and the identities

[TABLE]

we obtain

[TABLE]

Obviously we have the following equalities

[TABLE]

and for $R^{k}=Ax^{k}+Bz^{k}-c$ , we have

[TABLE]

Since

[TABLE]

one has that

[TABLE]

Thus we have from (6.6) that

[TABLE]

Thus we have from (6.5) and (6.7) that

[TABLE]

Combining (6.4) and (6.8), we obtain

[TABLE]

We consider the two cases $\tau\in(0,1]$ and $\tau\in(1,(1+\sqrt{5})/2)$ separately.

Case (i): when $\tau\in(0,1]$ . We have

[TABLE]

It follows from (6.10), in this case, that

[TABLE]

Case (ii): when $\tau\in(1,(1+\sqrt{5})/2)$ . We have

[TABLE]

It follows from (6.10), in this case, that

[TABLE]

In view of (6.11) and (6.13), we obtain formula (2.1). $\Box$

Declarations

Acknowledgements. We are grateful to the reviewer for his/her constructive feedback.

Authors’ contributions. All authors contributed equally to this study.

Availability of supporting data. The datasets used or analyzed in the study are available from the author upon reasonable request.

Funding. This work was supported by the National Key R&D Program of China (2022YFA1004000), the National Natural Science Foundation of China (12201097, 12071055, 12371298), and partially supported by the Dalian High-level Talent Innovation Project (2020RD09).

Ethical Approval. Not applicable.

Competing interests. The authors declare no competing interests.

Bibliography

[1] S. Agrawal, Z. Z. Wang and Y. Y. Ye, A dynamic near-optimal algorithm for online linear programming, Operations Research, 62(4) (2014), 876-890.
[2] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn., 3 (2011), 1-122.
[3] S. Chaudhary and D. Kalathil, Safe online convex optimization with unknown linear safety constraints, Proceedings of the AAAI Conference on Artificial Intelligence, 36(6) (2022), 6175-6182.
[4] L. Chen, D. F. Sun and K.-C. Toh, An efficient inexact symmetric Gauss-Seidel based majorized ADMM for high-dimensional convex composite conic programming, Mathematical Programming, 161(1-2) (2017), 237-270.
[5] Y. Cui, X. D. Li, D. F. Sun and K.-C. Toh, On the convergence properties of a majorized ADMM for linearly constrained convex optimization problems with coupled objective functions, Journal of Optimization Theory and Applications, 169 (2016), 1013-1041.
[6] J. Duchi and Y. Singer, Efficient online and batch learning using forward backward splitting, Journal of Machine Learning Research, 10 (2009), 2899-2934.
[7] M. Fazel, T. K. Pong, D. F. Sun and P. Tseng, Hankel matrix rank minimization with applications to system identification and realization, SIAM Journal on Matrix Analysis and Applications, 34 (2013), 946-977.
[8] D. Gabay and B. Mercier, A dual algorithm for the solution of nonlinear variational problems via finite element approximation, Computers & Mathematics with Applications, 2 (1976), 17-40.

| IterNum | Online-spADMM (n = 10) | Online-spADMM (n = 20) | Online-spADMM (n = 50) | Online-spADMM (n = 100) | OADM (n = 10) | OADM (n = 20) | OADM (n = 50) | OADM (n = 100) |
|---|---|---|---|---|---|---|---|---|
| 5000 | 0.023 | 0.028 | 0.097 | 0.264 | 0.029 | 0.087 | 0.222 | 0.628 |
| 10000 | 0.046 | 0.055 | 0.212 | 0.536 | 0.055 | 0.174 | 0.441 | 1.261 |
| 20000 | 0.090 | 0.108 | 0.420 | 1.095 | 0.220 | 0.358 | 0.879 | 2.757 |
| 50000 | 0.225 | 0.272 | 1.086 | 2.933 | 0.562 | 0.782 | 2.249 | 6.515 |

| IterNum | Online-spADMM (n = 10) | Online-spADMM (n = 20) | Online-spADMM (n = 50) | OADM (n = 10) | OADM (n = 20) | OADM (n = 50) |
|---|---|---|---|---|---|---|
| 5000 | 0.026 | 0.031 | 0.056 | 0.059 | 0.072 | 0.155 |
| 10000 | 0.052 | 0.061 | 0.112 | 0.113 | 0.141 | 0.310 |
| 20000 | 0.102 | 0.125 | 0.223 | 0.225 | 0.282 | 0.619 |
| 50000 | 0.257 | 0.308 | 0.567 | 0.556 | 0.705 | 1.551 |

| IterNum | Online-spADMM (n = 10) | Online-spADMM (n = 20) | Online-spADMM (n = 50) | Online-spADMM (n = 100) | OADM (n = 10) | OADM (n = 20) | OADM (n = 50) | OADM (n = 100) |
|---|---|---|---|---|---|---|---|---|
| 5000 | 0.028 | 0.029 | 0.046 | 0.109 | 0.061 | 0.071 | 0.146 | 0.382 |
| 10000 | 0.053 | 0.057 | 0.091 | 0.218 | 0.118 | 0.141 | 0.291 | 0.763 |
| 20000 | 0.103 | 0.114 | 0.180 | 0.442 | 0.237 | 0.282 | 0.578 | 1.521 |
| 50000 | 0.257 | 0.287 | 0.451 | 1.112 | 0.587 | 0.702 | 1.459 | 3.822 |


