Robust Quadratic Programming with Mixed-Integer Uncertainty

Areesh Mittal; Can Gokalp; Grani A. Hanasusanto

arXiv:1706.01949·math.OC·December 19, 2018·INFORMS J. Comput.

TL;DR

This paper develops copositive programming reformulations and SDP approximations for robust quadratic programs with mixed-integer uncertainty, providing more effective solutions than existing methods.

Contribution

It introduces exact copositive reformulations and a conservative SDP approximation for mixed-integer robust quadratic programs, extending previous continuous-only approaches.

Findings

1. SDP approximation outperforms S-lemma-based methods
2. Reformulations applicable to two-stage problems with recourse
3. Demonstrated effectiveness on practical optimization problems

Abstract

We study robust convex quadratic programs where the uncertain problem parameters can contain both continuous and integer components. Under the natural boundedness assumption on the uncertainty set, we show that the generic problems are amenable to exact copositive programming reformulations of polynomial size. These convex optimization problems are NP-hard but admit a conservative semidefinite programming (SDP) approximation that can be solved efficiently. We prove that the popular approximate S-lemma method --- which is valid only in the case of continuous uncertainty --- is weaker than our approximation. We also show that all results can be extended to the two-stage robust quadratic optimization setting if the problem has complete recourse. We assess the effectiveness of our proposed SDP reformulations and demonstrate their superiority over the state-of-the-art solution schemes on…

Tables (6)

Table 1: Numerical results comparing the proposed SDP approximation (‘SDP’), the approximate $\mathcal{S}$-lemma method (‘$\mathcal{S}$-lemma’) and the approximation scheme proposed by Bertsimas and Sim [7] (‘B&S’) for the least squares problem. The ‘objective gap’ quantifies the increase in the worst-case residuals estimated using the approximation methods relative to the Benders’ lower bound.

Statistic	Objective gap (SDP)	Objective gap ($\mathcal{S}$-lemma)	Objective gap (B&S)
Mean	0.0%	108.4%	99.7%
10th Percentile	0.0%	93.9%	80.5%
90th Percentile	0.0%	119.6%	115.3%

Table 2: Solution times for the Benders’ constraint generation method (‘Benders’), the proposed SDP approximation (‘SDP’), the approximate $\mathcal{S}$-lemma method (‘$\mathcal{S}$-lemma’) and the approximation scheme proposed by Bertsimas and Sim [7] (‘B&S’) for the least squares problem.

Metric	Benders	SDP	$\mathcal{S}$-lemma	B&S
Mean solution time (in secs)	626.9	10.2	0.45	0.004

Table 3: Numerical results for the proposed SDP approximation (‘SDP’) and the linear decision rules approximation (‘LDR’) for the project crashing problem. The ‘objective gap’ quantifies the increase in the worst-case makespan estimated using the approximation methods relative to the optimal worst-case makespan. The ‘suboptimality’ quantifies the increase in the actual worst-case makespan of the resource allocations found using the approximation methods relative to the optimal worst-case makespan.

Statistic	Objective gap (SDP)	Objective gap (LDR)	Suboptimality (SDP)	Suboptimality (LDR)
Mean	2.7%	26.9%	1.7%	10.0%
10th Percentile	2.0%	23.8%	1.3%	7.1%
90th Percentile	3.2%	30.2%	2.2%	12.8%

Table 4: Solution times for the Benders’ constraint generation method (‘Benders’), the proposed SDP approximation (‘SDP’) and the linear decision rules approximation (‘LDR’) for the project crashing problem.

Metric	Benders	SDP	LDR
Mean solution time (in secs)	518.0	85.0	0.16

Table 5: Numerical results for the SDP approximations for the newsvendor model with integer uncertainty set (‘SDP True’) and the model that ignores the integrality restriction (‘SDP Cont’). The ‘objective gap’ quantifies the increase in the worst-case cost estimated using the approximation methods relative to the optimal worst-case cost. The ‘suboptimality’ quantifies the increase in the actual worst-case cost of the order quantities found using the approximation methods relative to the optimal worst-case cost.

Statistic	Objective gap (SDP True)	Objective gap (SDP Cont)	Suboptimality (SDP True)	Suboptimality (SDP Cont)
Mean	13.1%	85.2%	13.0%	84.9%
10th Percentile	0.0%	25.8%	0.0%	25.7%
90th Percentile	28.4%	173.7%	27.6%	173.5%

Table 6: Solution times for the Benders’ constraint generation method for the newsvendor problem (‘Benders’), the SDP approximations for the model with integer uncertainty set (‘SDP True’) and the model that ignores the integrality restriction (‘SDP Cont’).

Metric	Benders	SDP True	SDP Cont
Mean solution time (in secs)	52.9	11.3	0.63

Equations (174)

\begin{array}[]{clll}\displaystyle\textnormal{minimize}&\displaystyle\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x})\\ \textnormal{subject to}&\displaystyle{\bm{x}\in\mathcal{X}}.\end{array}

\sup_{\bm{\xi}\in\Xi}\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x}).

\begin{array}[]{clll}\displaystyle\textnormal{minimize}&\displaystyle\sup_{\bm{\xi}\in\Xi}\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x})\\ \textnormal{subject to}&\displaystyle{\bm{x}\in\mathcal{X}},\end{array}

\Xi=\left\{\bm{\xi}\in\mathbb{R}_{+}^{K}:\begin{array}[]{l}\bm{S}\bm{\xi}=\bm{t}\\ \xi_{\ell}\in\mathbb{Z}\quad\forall\ell\in[L]\end{array}\right\},

\begin{array}[]{clll}\textnormal{minimize}&\displaystyle\bm{x}^{\top}{\bm{\Sigma}}\bm{x}-\lambda{\bm{\mu}}^{\top}\bm{x}\\ \ \textnormal{subject to}&\displaystyle\bm{x}\in\Delta^{K},\end{array}

\hat{\bm{\mu}}=\frac{1}{N}\sum_{n\in[N]}\hat{\bm{\xi}}_{n}\quad\text{and}\quad\hat{\bm{\Sigma}}=\frac{1}{N-1}\sum_{n\in[N]}(\hat{\bm{\xi}}_{n}-\hat{\bm{\mu}})(\hat{\bm{\xi}}_{n}-\hat{\bm{\mu}})^{\top}.

\Xi=\left\{\left((\hat{\bm{\xi}}_{n})_{n\in[N]},(\hat{\bm{\chi}}_{n})_{n\in[N]}\right)\in\mathbb{R}_{+}^{NK+NK}:\hat{\bm{\xi}}_{n}\in\Xi_{n},\;\hat{\bm{\chi}}_{n}=\hat{\bm{\xi}}_{n}-\frac{1}{N}\sum_{n^{\prime}\in[N]}\hat{\bm{\xi}}_{n^{\prime}}\quad\forall n\in[N]\right\}

\begin{array}[]{clll}\textnormal{minimize}&\displaystyle\sup_{\left((\hat{\bm{\xi}}_{n})_{n},(\hat{\bm{\chi}}_{n})_{n}\right)\in\Xi}\left(\frac{1}{N-1}\sum_{n\in[N]}(\hat{\bm{\chi}}_{n}^{\top}\bm{x})^{2}-\frac{\lambda}{N}\sum_{n\in[N]}\hat{\bm{\xi}}_{n}^{\top}\bm{x}\right)\\ \textnormal{subject to}&\bm{x}\in\Delta^{K}.\end{array}

\bm{A}(\bm{x})=\frac{1}{\sqrt{N-1}}\begin{bmatrix}\bm{0}^{\top}&\cdots&\bm{0}^{\top}&\bm{0}^{\top}&\cdots&\bm{0}^{\top}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ \bm{0}^{\top}&\cdots&\bm{0}^{\top}&\bm{0}^{\top}&\cdots&\bm{0}^{\top}\\ \bm{0}^{\top}&\cdots&\bm{0}^{\top}&\bm{x}^{\top}&\cdots&\bm{0}^{\top}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ \bm{0}^{\top}&\cdots&\bm{0}^{\top}&\bm{0}^{\top}&\cdots&\bm{x}^{\top}\end{bmatrix},\quad\bm{b}(\bm{x})=-\frac{\lambda}{N}\begin{bmatrix}\bm{x}\\ \vdots\\ \bm{x}\\ \bm{0}\\ \vdots\\ \bm{0}\end{bmatrix},\quad\text{and}\quad c(\bm{x})=0.

\begin{array}[]{clll}\textnormal{minimize}&\displaystyle\sup_{\bm{z}\in\mathcal{Z}}\;\sum_{(i,j)\in\mathcal{A}}({d}_{ij}-x_{ij})z_{ij}\\ \textnormal{subject to}&\displaystyle\bm{x}\in\mathcal{X},\end{array}

\mathcal{Z}=\left\{\bm{z}\in\{0,1\}^{|\mathcal{A}|}:\sum_{j:(i,j)\in\mathcal{A}}z_{ij}-\sum_{j:(j,i)\in\mathcal{A}}z_{ji}=\left\{\begin{array}[]{ll}1&\text{if }i=1\\ -1&\text{if }i=|\mathcal{V}|\\ 0&\text{if otherwise}\end{array}\right.,\quad\forall i\in\mathcal{V}\right\}.

\begin{array}[]{clll}\textnormal{minimize}&\displaystyle\sup_{\bm{d}\in\mathcal{D}}\left(\sup_{\bm{z}\in\mathcal{Z}}\;\sum_{(i,j)\in\mathcal{A}}({d}_{ij}-x_{ij})z_{ij}\right)\\ \textnormal{subject to}&\displaystyle\bm{x}\in\mathcal{X}.\end{array}

\begin{array}[]{rlll}\displaystyle\sup_{\bm{d}\in\mathcal{D}}\sup_{\bm{z}\in\mathcal{Z}}\;\sum_{(i,j)\in\mathcal{A}}({d}_{ij}-x_{ij})z_{ij}=&\displaystyle\sup_{(\bm{d},\bm{z},\bm{q})\in\Xi}\;\mathbf{e}^{\top}\bm{q}-\bm{x}^{\top}\bm{z},\end{array}

\Xi=\left\{(\bm{d},\bm{z},\bm{q})\in\mathcal{D}\times\mathcal{Z}\times\mathbb{R}_{+}^{|\mathcal{A}|}:\bm{q}\leq\bm{z},\;\bm{q}\leq\bm{d},\;\bm{q}\geq\bm{d}-\mathbf{e}+\bm{z}\right\}.

Z(\bm{x})=\sup_{\bm{\xi}\in\Xi}\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x}),

\begin{array}[]{clll}\displaystyle\textnormal{minimize}&\displaystyle Z(\bm{x})\\ \textnormal{subject to}&\displaystyle{\bm{x}\in\mathcal{X}}.\end{array}

\begin{array}[]{clll}\displaystyle\textnormal{maximize}&\displaystyle\bm{\xi}^{\top}\bm{Q}\bm{\xi}+\bm{r}^{\top}\bm{\xi}\\ \textnormal{subject to}&\displaystyle\bm{\xi}\in\mathbb{R}_{+}^{P}\\ &\displaystyle\bm{F}\bm{\xi}=\bm{g}\\ &\displaystyle\xi_{\ell}\in\{0,1\}&\forall\ell\in\mathcal{L}\end{array}

\begin{array}[]{clll}\displaystyle\textnormal{maximize}&\displaystyle\textup{tr}(\bm{\Omega}\bm{Q})+\bm{r}^{\top}\bm{\xi}\\ \textnormal{subject to}&\displaystyle\bm{\xi}\in\mathbb{R}_{+}^{P},\;\bm{\Omega}\in\mathbb{S}_{+}^{P}\\ &\displaystyle\bm{F}\bm{\xi}=\bm{g},\;\;\operatorname*{diag}(\bm{F}\bm{\Omega}\bm{F}^{\top})=\bm{g}\circ\bm{g}\\ &\displaystyle\xi_{\ell}=\Omega_{\ell\ell}\qquad\forall\ell\in\mathcal{L}\\ &\begin{bmatrix}\bm{\Omega}&\bm{\xi}\\ \bm{\xi}^{\top}&1\\ \end{bmatrix}\succeq_{\mathcal{C}^{*}}\bm{0},\end{array}

\xi=\sum_{q\in[Q]}2^{q-1}\chi_{q}=\mathbf{v}_{Q}^{\top}\bm{\chi}.

\begin{array}[]{clll}\displaystyle Z(\bm{x})=&\displaystyle\sup&\textup{tr}\left({\bm{\mathcal{A}}}(\bm{x})\bm{\Omega}{\bm{\mathcal{A}}}(\bm{x})^{\top}\right)+{\bm{\mathscr{b}}}(\bm{x})^{\top}{\bm{\xi}}^{\prime}+c(\bm{x})\\ &\textnormal{s.t.}&\displaystyle{{\bm{\xi}}^{\prime}}\in\mathbb{R}_{+}^{{{K^{\prime}}}},\;\bm{\Omega}\in\mathbb{S}_{+}^{{{K^{\prime}}}}\\ &&{\bm{\mathcal{S}}}{\bm{\xi}}^{\prime}={\bm{\mathscr{t}}},\;\operatorname*{diag}({\bm{\mathcal{S}}}\bm{\Omega}{\bm{\mathcal{S}}}^{\top})={\bm{\mathscr{t}}}\circ{\bm{\mathscr{t}}}\\ &&{\xi}_{\ell}^{\prime}=\Omega_{\ell\ell}\quad\forall\ell\in\left[LQ\right]\\ &&\begin{bmatrix}\bm{\Omega}&{\bm{\xi}}^{\prime}\\ {\bm{\xi}}^{\prime\top}&1\\ \end{bmatrix}\succeq_{\mathcal{C}^{*}}\bm{0},\end{array}

\begin{array}[]{c}{\bm{\mathcal{S}}}=\begin{bmatrix}\bm{0}&\cdots&\bm{0}&\bm{0}&\cdots&\bm{0}&\bm{S}\\ -\mathbf{v}_{Q}^{\top}&\cdots&\bm{0}^{\top}&\bm{0}^{\top}&\cdots&\bm{0}^{\top}&\mathbf{e}_{1}^{\top}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots&\vdots\\ \bm{0}^{\top}&\cdots&-\mathbf{v}_{Q}^{\top}&\bm{0}^{\top}&\cdots&\bm{0}^{\top}&\mathbf{e}_{L}^{\top}\\ \mathbb{I}&\cdots&\bm{0}&\mathbb{I}&\cdots&\bm{0}&\bm{0}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots&\vdots\\ \bm{0}&\cdots&\mathbb{I}&\bm{0}&\cdots&\mathbb{I}&\bm{0}\\ \end{bmatrix}\in\mathbb{R}^{{J^{\prime}}\times{K^{\prime}}},\quad{\bm{\mathscr{t}}}=\begin{bmatrix}\bm{t}\\ 0\\ \vdots\\ 0\\ \mathbf{e}\\ \vdots\\ \mathbf{e}\\ \end{bmatrix}\in\mathbb{R}^{{J^{\prime}}},\\ \quad\quad{\bm{\mathcal{A}}}(\bm{x})=\begin{bmatrix}\bm{0}&\cdots&\bm{0}&\bm{0}&\cdots&\bm{0}&\bm{A}(\bm{x})\end{bmatrix}\in\mathbb{R}^{M\times{K^{\prime}}}\quad\text{ and }\\ \quad{\bm{\mathscr{b}}}(\bm{x})=\begin{bmatrix}\bm{0}^{\top}&\cdots&\bm{0}^{\top}&\bm{0}^{\top}&\cdots&\bm{0}^{\top}&\bm{b}(\bm{x})^{\top}\end{bmatrix}^{\top}\in\mathbb{R}^{{K^{\prime}}},\end{array}

J^{\prime}=LQ+J+L\quad\text{and}\quad K^{\prime}=2LQ+K.

\begin{array}[]{clll}Z(\bm{x})=&\displaystyle\sup&\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x})\\ &\textnormal{s.t.}&\displaystyle\bm{\xi}\in\mathbb{R}_{+}^{K},\;\bm{\chi}_{\ell}\in\{0,1\}^{Q}&\forall\ell\in[L]\\ &&\bm{S}\bm{\xi}=\bm{t}\\ &&\displaystyle\xi_{\ell}=\mathbf{v}_{Q}^{\top}\bm{\chi}_{\ell}&\forall\ell\in[L].\end{array}

\begin{array}[]{clll}Z(\bm{x})=&\displaystyle\sup&\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x})\\ &\textnormal{s.t.}&\displaystyle\bm{\xi}\in\mathbb{R}_{+}^{K},\;\bm{\eta}_{\ell}\in\mathbb{R}_{+}^{Q},\;\bm{\chi}_{\ell}\in\{0,1\}^{Q}&\forall\ell\in[L]\\ &&\bm{S}\bm{\xi}=\bm{t}\\ &&\displaystyle\xi_{\ell}=\mathbf{v}_{Q}^{\top}\bm{\chi}_{\ell}&\forall\ell\in[L]\\ &&\bm{\chi}_{\ell}+\bm{\eta}_{\ell}=\mathbf{e}&\forall\ell\in[L].\end{array}

{\bm{\xi}}^{\prime}=\begin{bmatrix}\bm{\chi}_{1}^{\top}&\cdots&\bm{\chi}_{L}^{\top}&\bm{\eta}_{1}^{\top}&\cdots&\bm{\eta}_{L}^{\top}&\bm{\xi}^{\top}\end{bmatrix}^{\top}\in\mathbb{R}_{+}^{K^{\prime}}

\begin{array}[]{clll}Z(\bm{x})=&\displaystyle\sup&\left\lVert\bm{{\bm{\mathcal{A}}}}(\bm{x}){\bm{\xi}}^{\prime}\right\rVert^{2}+\bm{{\bm{\mathscr{b}}}}(\bm{x})^{\top}{{\bm{\xi}}^{\prime}}+c(\bm{x})\\ &\textnormal{s.t.}&\displaystyle{{{\bm{\xi}}^{\prime}}}\in\mathbb{R}_{+}^{{K^{\prime}}}\\ &&{\bm{\mathcal{S}}}{{{\bm{\xi}}^{\prime}}}={\bm{\mathscr{t}}}\\ &&\xi_{\ell}^{\prime}\in\{0,1\}&\forall\ell\in\left[LQ\right].\end{array}

\begin{array}[]{clll}\displaystyle\overline{Z}(\bm{x})=&\inf&c(\bm{x})+{\bm{\mathscr{t}}}^{\top}\bm{\psi}+({\bm{\mathscr{t}}}\circ{\bm{\mathscr{t}}})^{\top}\bm{\phi}+\tau\\ &\textnormal{s.t.}&\displaystyle\tau\in\mathbb{R},\;\bm{\psi},\bm{\phi}\in\mathbb{R}^{{J^{\prime}}},\;\bm{\gamma}\in\mathbb{R}^{LQ}\\ &&\begin{bmatrix}{\bm{\mathcal{S}}}^{\top}\operatorname*{diag}(\bm{\phi}){\bm{\mathcal{S}}}-{\bm{\mathcal{A}}}(\bm{x})^{\top}{\bm{\mathcal{A}}}(\bm{x})-\operatorname*{diag}\left([\bm{\gamma}^{\top}\;\bm{0}^{\top}]^{\top}\right)&\frac{1}{2}\left({\bm{\mathcal{S}}}^{\top}\bm{\psi}-{\bm{\mathscr{b}}}(\bm{x})+[\bm{\gamma}^{\top}\;\bm{0}^{\top}]^{\top}\right)\\ \frac{1}{2}\left({\bm{\mathcal{S}}}^{\top}\bm{\psi}-{\bm{\mathscr{b}}}(\bm{x})+[\bm{\gamma}^{\top}\;\bm{0}^{\top}]^{\top}\right)^{\top}&\tau\end{bmatrix}\succeq_{\mathcal{C}}\bm{0}.\end{array}

\Xi^{\prime}=\left\{{\bm{\xi}}^{\prime}\in\mathbb{R}^{K^{\prime}}:{\bm{\mathcal{S}}}{\bm{\xi}}^{\prime}={\bm{\mathscr{t}}},\;{\bm{\xi}}^{\prime}\geq\bm{0}\right\}

\bm{M}=\begin{bmatrix}\bm{P}&\bm{Q}\\ \bm{Q}^{\top}&\bm{R}\end{bmatrix}.

\begin{array}[]{rll}[\bm{\xi}^{\top}\;\bm{\rho}^{\top}]\bm{M}[\bm{\xi}^{\top}\;\bm{\rho}^{\top}]^{\top}&=&\displaystyle\bm{\xi}^{\top}\bm{P}\bm{\xi}+2\bm{\xi}^{\top}\bm{Q}\bm{\rho}+\bm{\rho}^{\top}\bm{R}\bm{\rho}\\ &=&\displaystyle(\bm{\xi}+\bm{P}^{-1}\bm{Q}\bm{\rho})^{\top}\bm{P}(\bm{\xi}+\bm{P}^{-1}\bm{Q}\bm{\rho})+\bm{\rho}^{\top}(\bm{R}-\bm{Q}^{\top}\bm{P}^{-1}\bm{Q})\bm{\rho}\ \geq\ 0.\end{array}

Full text

Robust Quadratic Programming with Mixed-Integer Uncertainty

Areesh Mittal

Can Gokalp

Grani A. Hanasusanto

Abstract

We study robust convex quadratic programs where the uncertain problem parameters can contain both continuous and integer components. Under the natural boundedness assumption on the uncertainty set, we show that the generic problems are amenable to exact copositive programming reformulations of polynomial size. These convex optimization problems are NP-hard but admit a conservative semidefinite programming (SDP) approximation that can be solved efficiently. We prove that the popular approximate $\mathcal{S}$ -lemma method—which is valid only in the case of continuous uncertainty—is weaker than our approximation. We also show that all results can be extended to the two-stage robust quadratic optimization setting if the problem has complete recourse. We assess the effectiveness of our proposed SDP reformulations and demonstrate their superiority over the state-of-the-art solution schemes on instances of least squares, project management, and multi-item newsvendor problems.

1 Introduction

A wide variety of decision-making problems in engineering, physical, or economic systems can be formulated as convex quadratic programs of the form

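$$\begin{array}{cl}\textnormal{minimize}&\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x})\\ \textnormal{subject to}&\bm{x}\in\mathcal{X}.\end{array}\tag{1}$$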

Here, $\mathcal{X}\subseteq\mathbb{R}^{D}$ is the feasible set of the decision vector $\bm{x}$ and is assumed to be described by a polytope, $\bm{\xi}\in\mathbb{R}^{K}$ is a vector of exogenous problem parameters, $\bm{A}(\bm{x}):\mathcal{X}\rightarrow\mathbb{R}^{M\times K}$ and $\bm{b}(\bm{x}):\mathcal{X}\rightarrow\mathbb{R}^{K}$ are matrix- and vector-valued affine functions, respectively, while $c(\bm{x}):\mathcal{X}\rightarrow\mathbb{R}$ is a convex quadratic function. The objective of problem (1) is to determine the best decision $\bm{x}\in\mathcal{X}$ that minimizes the quadratic function $\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x})$ . The generic formulation (1) includes the class of linear programming problems [42] as a special case (when $\bm{A}=\bm{0}$ ), and has numerous important applications, e.g., in portfolio optimization [37], least squares regression [26], supervised classification [15], optimal control [41], etc. In addition to their exceptional modeling power, quadratic optimization problems of the form (1) are attractive as they can be solved efficiently using standard off-the-shelf solvers.

In many situations of practical interest, the exact values of the parameters $\bm{\xi}$ are unknown when the decisions are made and can only be estimated through limited historical data. Thus, they are subject to potentially significant errors that can adversely impact the out-of-sample performance of an optimal solution $\bm{x}$ . One popular approach to address decision problems under uncertainty is via robust optimization [2]. In this setting, we assume that the vector of uncertain parameters $\bm{\xi}$ lies within a prescribed uncertainty set $\Xi$ and we replace the objective function of (1) with the worst-case function given by

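$$\begin{array}{cl}\textnormal{minimize}&\displaystyle\sup_{\bm{\xi}\in\Xi}\left\lVert\bm{A}(\bm{x})\bm{\xi}\right\rVert^{2}+\bm{b}(\bm{x})^{\top}\bm{\xi}+c(\bm{x})\\ \textnormal{subject to}&\bm{x}\in\mathcal{X}.\end{array}\tag{2}$$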

This optimization problem yields a solution $\bm{x}\in\mathcal{X}$ that minimizes the quadratic objective function under the most adverse uncertain parameter realization $\bm{\xi}\in\Xi$ .
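The difference between a nominal and a robust decision can be seen on a toy instance. The sketch below is only an illustration under stated assumptions: the uncertainty set is a hypothetical finite list of scenarios, the feasible set is a hypothetical grid over $[0,1]$, and $\bm{A}(\bm{x})=[x\;\;{-1}]$, $\bm{b}(\bm{x})=\bm{0}$, $c(\bm{x})=0$ are chosen for convenience. None of these values come from the paper, and brute-force enumeration is used here in place of the reformulations developed later.

```python
# Toy instance of the robust problem; all data below are illustrative assumptions.

def objective(x, xi):
    """||A(x) xi||^2 + b(x)^T xi + c(x) with A(x) = [x, -1], b(x) = 0, c(x) = 0."""
    return (x * xi[0] - xi[1]) ** 2

def worst_case(x, scenarios):
    """Inner supremum, computed by enumeration over a finite uncertainty set."""
    return max(objective(x, xi) for xi in scenarios)

scenarios = [(1.0, 1.0), (3.0, 1.0)]       # hypothetical uncertainty set
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]   # hypothetical grid over X = [0, 1]

# Nominal decision: optimize against the first scenario only.
nominal_x = min(candidates, key=lambda x: objective(x, scenarios[0]))

# Robust decision: optimize against the worst scenario.
robust_x = min(candidates, key=lambda x: worst_case(x, scenarios))

print("nominal x:", nominal_x, "worst case:", worst_case(nominal_x, scenarios))
print("robust  x:", robust_x, "worst case:", worst_case(robust_x, scenarios))
```

On this instance the nominal decision is optimal for its own scenario but performs poorly under the other one, whereas the robust decision balances the two.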

Robust optimization models are appealing as they require minimal assumptions on the description of the uncertain parameters and because they often lead to efficient solution schemes. In a linear programming setting, the resulting robust optimization problems are tractable for many relevant uncertainty sets and have been broadly applied to problems in engineering, finance, machine learning, and operations management [4, 6, 27]. Tractable reformulations for robust quadratic programming problems are derived in [25, 36] for the particular case when the quadratic functions (in $\bm{x}$ ) exhibit a concave dependency in the uncertain parameters $\bm{\xi}$ . When the functions are convex in both $\bm{x}$ and $\bm{\xi}$ as we consider in this paper, the corresponding robust problems are generically NP-hard if the uncertainty set is defined by a polytope, but become tractable—by virtue of the exact $\mathcal{S}$ -lemma—if the uncertainty set is defined by an ellipsoid [4, 23]. Tractable approximation schemes have also been proposed for the standard setting that we consider in this paper. If the uncertainty set is described by a finite intersection of ellipsoids then a conservative semidefinite programming (SDP) reformulation is obtained by leveraging the approximate $\mathcal{S}$ -lemma [5]. In [7], a special class of functions is introduced to approximate the quadratic terms in (2). The arising robust optimization problems are tractable if the uncertainty sets are defined through affinely transformed norm balls. In [36], conservative and progressive SDP approximations are devised by replacing each quadratic term in (2) with linear upper and lower bounds, respectively.

Most of the existing literature in robust optimization assumes that the uncertain problem parameters are continuous and reside in a tractable conic representable set $\Xi$. However, certain applications require the use of mixed-integer uncertainty. Such decision problems arise prominently in the supply chain context, where demands of non-perishable products are more naturally represented as integer quantities, and in the discrete choice modeling context, where the outcomes are chosen from a discrete set of alternatives. Other pertinent examples include robust optimization applications in logistic regression [43], classification problems with noisy labels [13, 51] and network optimization [1, 48]. If the uncertain parameters contain mixed-integer components then the problem becomes computationally formidable even in the simplest setting. Specifically, if all functions are affine in $\bm{\xi}$ and the uncertain problem parameters are described by binary vectors, then computing the worst-case values in (2) is already NP-hard [21]. The corresponding robust version of (1) is tractable only in a few contrived situations, e.g., when the uncertainty set possesses a total unimodularity property or is described by the convex hull of polynomially many integer vectors [4]. Perhaps due to these limitations, there are currently very few results in the literature that provide a systematic and rigorous way to handle generic robust optimization problems with mixed-integer uncertainty. In this paper, we first reformulate the original problem as an equivalent finite-dimensional conic program of polynomial size, which absorbs all the difficulty in its cone, and then replace the cone with tractable inner approximations. An alternative way to handle integer uncertain parameters is to solve the problem while simply ignoring the integrality assumption. However, doing so enlarges the uncertainty set and thus adds undesired conservativeness. Indeed, in our numerical experiments, we demonstrate that ignoring the integrality assumption on the uncertain parameters leads to overly conservative solutions.
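The effect of dropping integrality can be checked by hand on a small example. The following sketch assumes a hypothetical two-dimensional uncertainty set $\{\bm{\xi}\in\mathbb{R}_{+}^{2}:\xi_{1}+\xi_{2}\leq 2.5\}$ with integer components, a fixed hypothetical decision $\bm{x}$, and the objective $(\bm{x}^{\top}\bm{\xi})^{2}$. These choices are illustrative and not taken from the paper's experiments. It relies on the standard fact that a convex function attains its maximum over a polytope at a vertex.

```python
from itertools import product

def objective(x, xi):
    """Convex quadratic in xi: (x^T xi)^2, i.e. A(x) = x^T, b(x) = 0, c(x) = 0."""
    return (x[0] * xi[0] + x[1] * xi[1]) ** 2

x = (1.0, 0.5)   # a fixed, hypothetical decision
budget = 2.5     # hypothetical right-hand side of xi_1 + xi_2 <= budget

# Integer uncertainty set: enumerate nonnegative integer points within the budget.
integer_points = [xi for xi in product(range(3), repeat=2) if sum(xi) <= budget]
worst_integer = max(objective(x, xi) for xi in integer_points)

# Continuous relaxation: it suffices to check the vertices of the polytope.
vertices = [(0.0, 0.0), (budget, 0.0), (0.0, budget)]
worst_relaxed = max(objective(x, xi) for xi in vertices)

print("worst case, integer set:       ", worst_integer)
print("worst case, continuous relaxed:", worst_relaxed)
```

The relaxed set contains fractional points such as $(2.5, 0)$ that no integer realization can reach, so the relaxed worst case is strictly larger on this instance.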

Optimization problems under uncertainty may also involve adaptive recourse decisions which are taken once the uncertain parameters are realized [2, 46]. This setting gives rise to difficult min-max-min optimization problems which are generically NP-hard even if both the first- and the second-stage cost functions are affine in $\bm{x}$ and $\bm{\xi}$ [3]. Thus, they can only be solved approximately, either by employing discretization schemes which approximate the continuum of the uncertainty space with finitely many points [28, 31, 45] or by employing decision rule methods, which restrict the set of all possible recourse decisions to simpler parametric forms in $\bm{\xi}$ [3, 22, 24]. We refer the reader to [17] for a comprehensive review of recent results in adaptive robust optimization. In this paper, we consider two-stage robust optimization problems with quadratic first- and second-stage objective functions and a mixed-integer uncertainty set. We show that if the problem has complete recourse, then it can be reformulated as a conic program, which is amenable to tractable approximations.

The conic programming route that we take here to model optimization problems under uncertainty has previously been traversed. In [39], completely positive programming reformulations are derived to compute best-case expectations of mixed zero-one linear programs under first- and second-order moment information on the joint distributions of the uncertain parameters. This result has been extended and applied to other pertinent settings such as stochastic appointment scheduling problems, discrete choice models, random walks and sequencing problems [32, 34, 38]. Recently, equivalent copositive programming reformulations have been derived for generic two-stage robust linear programs [29, 50]. The resulting optimization problems are amenable to conservative semidefinite programming reformulations which are often stronger than the ones obtained from employing quadratic decision rules on the recourse function. In [20], the authors provide a completely positive reformulation for a two-stage distributionally robust supply chain risk mitigation problem. They allow some components of $\bm{\xi}$ to be binary, but assume precise knowledge of the first- and the second-order moments of the distribution of $\bm{\xi}$. The objective function that they consider is quadratic in the second-stage decision variables but affine in $\bm{\xi}$. In contrast to [20], we assume no information about the distribution of $\bm{\xi}$ other than the support. Furthermore, we allow the objective function to be quadratic in the decision variables as well as in $\bm{\xi}$, which helps us model a more general class of robust problems, e.g., robust least squares [23].

In this paper, we advance the state-of-the-art in robust optimization along several directions. We summarize our main contributions as follows:

1. We prove that any robust convex quadratic program can be reformulated as a copositive program of polynomial size if the uncertainty set is given by a bounded mixed-integer polytope. We further show that the exactness result can be extended to the two-stage robust quadratic optimization setting if the problem has complete recourse.

2. By employing the hierarchies of semidefinite representable cones to approximate the copositive cones, we obtain sequences of tractable conservative approximations for the robust problem. These approximations can be made to have any arbitrary accuracy. We prove that even the simplest of these approximations is stronger than the well-known approximate $\mathcal{S}$ -lemma method if the problem instance has only continuous uncertain parameters. Furthermore, when some uncertain parameters are restricted to take integer values, the approximate $\mathcal{S}$ -lemma method is not applicable, while our method still generates a high-quality conservative solution.

3. We compare our approximation method to other state-of-the-art approximation schemes through extensive numerical experiments. We show that our approximation method generates better estimates of worst-case cost and yields less conservative solutions. We also demonstrate that ignoring the integrality assumption on the uncertainty set may lead to inferior solutions to the robust problem.

4. To the best of our knowledge, we are the first to provide an exact conic programming reformulation and to propose tractable semidefinite programming approximations for well-established classes of one-stage and two-stage robust quadratic programs.

The remainder of the paper is structured as follows. We formulate and discuss the generic robust quadratic programs in Section 2. We then derive the copositive programming reformulation in Section 3. Section 4 develops a conservative SDP reformulation and provides a theoretical comparison with the popular approximate $\mathcal{S}$ -lemma method. In Section 5, we extend the results of Section 3 along several directions including two-stage robust quadratic optimization. We demonstrate the impact of our proposed reformulation via numerical experiments in Section 6, and finally, we conclude in Section 7.

Notation:

We use $\mathbb{Z}\ (\mathbb{Z}_{+})$ to denote the set of (non-negative) integers. For any positive integer $I$ , we use $[I]$ to denote the index set $\{1,\dots,I\}$ . We use $\left\lVert.\right\rVert_{p}$ to denote the $l_{p}$ -norm. We drop the subscript and write $\left\lVert.\right\rVert$ when referring to the $l_{2}$ -norm. The identity matrix and the vector of all ones are denoted by $\mathbb{I}$ and $\mathbf{e}$ , respectively. The dimension of such matrices will be clear from the context. We denote by $\textup{tr}(\bm{M})$ the trace of a square matrix $\bm{M}$ . For a vector $\bm{v}$ , $\operatorname*{diag}(\bm{v})$ denotes the diagonal matrix with $\bm{v}$ on its diagonal; whereas for a square matrix $\bm{M}$ , $\operatorname*{diag}(\bm{M})$ denotes the vector comprising the diagonal elements of $\bm{M}$ . We define $\bm{P}\circ\bm{Q}$ as the Hadamard product (element-wise product) of two matrices $\bm{P}$ and $\bm{Q}$ of the same size. For any integer $Q\in\mathbb{Z}_{+}$ , we define $\mathbf{v}_{Q}=[2^{0}\;2^{1}\;\cdots\;2^{Q-1}]^{\top}$ as the vector comprising all $q$ -th powers of 2, for $q=0,1,\ldots,Q-1$ . We define by $\mathbb{S}^{K}$ ( $\mathbb{S}_{+}^{K}$ ) the space of all symmetric (positive semidefinite) matrices in $\mathbb{R}^{K\times K}$ . The cone of copositive matrices is denoted by $\mathcal{C}=\{\bm{M}\in\mathbb{S}^{K}:\bm{\xi}^{\top}\bm{M}\bm{\xi}\geq 0\;\forall\bm{\xi}\geq\bm{0}\}$ , while its dual cone, the cone of completely positive matrices, is denoted by $\mathcal{C}^{*}=\{\bm{M}\in\mathbb{S}^{K}:\bm{M}=\bm{B}\bm{B}^{\top}\text{ for some }\bm{B}\in\mathbb{R}_{+}^{K\times g(K)}\}$ , where $g(K)=\max\{{K+1\choose 2}-4,K\}$ [44]. For any $\bm{P},\bm{Q}\in\mathbb{S}^{K}$ , the relations $\bm{P}\succeq\bm{Q}$ , $\bm{P}\succeq_{\mathcal{C}}\bm{Q}$ , and $\bm{P}\succeq_{\mathcal{C}^{*}}\bm{Q}$ indicate that $\bm{P}-\bm{Q}$ is an element of $\mathbb{S}_{+}^{K}$ , $\mathcal{C}$ , and $\mathcal{C}^{*}$ , respectively.

2 Problem Formulation

We study robust convex quadratic programs (RQPs) of the form

[TABLE]

where the set $\mathcal{X}$ and the functions $\bm{A}(\bm{x}):\mathcal{X}\rightarrow\mathbb{R}^{M\times K}$ , $\bm{b}(\bm{x}):\mathcal{X}\rightarrow\mathbb{R}^{K}$ , and $c(\bm{x}):\mathcal{X}\rightarrow\mathbb{R}$ have the same definitions as those in (1). The vector $\bm{\xi}\in\mathbb{R}^{K}$ comprises all the uncertain problem parameters and is assumed to belong to the uncertainty set $\Xi$ given by a bounded mixed-integer polyhedral set

[TABLE]

where $\bm{S}\in\mathbb{R}^{J\times K}$ and $\bm{t}\in\mathbb{R}^{J}$ . We assume without loss of generality that the first $L$ elements of $\bm{\xi}$ are integer, while the remaining $K-L$ are continuous. Since $\Xi$ is bounded, we may further assume that there exists a scalar integer $Q\in\mathbb{Z}_{+}$ such that $\xi_{\ell}\in\{0,\cdots,2^{Q}-1\}$ for every $\ell\in[L]$ . Note that the quantity $Q$ is bounded by a polynomial function in the bit length of the description of $\bm{S}$ and $\bm{t}$ .

Example 1 (Robust Portfolio Optimization).

Consider the classical Markowitz mean-variance portfolio optimization problem

[TABLE]

where $\Delta^{K}$ is the unit simplex in $\mathbb{R}^{K}$ , $\lambda\in[0,\infty)$ is the prescribed risk tolerance level of the investor, while $\bm{\mu}\in\mathbb{R}^{K}$ and $\bm{\Sigma}\in\mathbb{S}^{K}$ are the true mean and covariance matrix of the asset returns, respectively. The objective of this problem is to determine the best vector of weights $\bm{x}\in\Delta^{K}$ that maximizes the mean portfolio return ${\bm{\mu}}^{\top}\bm{x}$ and that also minimizes the portfolio risk that is captured by the variance term $\bm{x}^{\top}{\bm{\Sigma}}\bm{x}$ . Here, the trade-off between these two terms is controlled by the scalar $\lambda$ in the objective function.

In practice, the true values of the parameters $\bm{\mu}$ and $\bm{\Sigma}$ are unknown and can only be estimated by using the available $N$ historical asset returns $\{\hat{\bm{\xi}}_{n}\}_{n\in[N]}$ , as follows:

[TABLE]

In the robust optimization setting, we assume that the precise location of each sample point $\hat{\bm{\xi}}_{n}$ is uncertain and is only known to belong to a prescribed uncertainty set $\Xi_{n}$ containing $\hat{\bm{\xi}}_{n}$ . To bring the resulting problem into the standard form (3), we introduce the expanded uncertainty set

[TABLE]

comprising the terms $\hat{\bm{\xi}}_{n}$ and ${\hat{\bm{\xi}}_{n}}-\hat{\bm{\mu}}$ , $n\in[N]$ . Using this uncertainty set, we arrive at the following robust version of (5):

[TABLE]

This problem constitutes an instance of (3) with the input parameters

[TABLE]

Example 2 (Robust Project Crashing).

Consider a project that is described by an activity-on-arc network $\mathcal{N}(\mathcal{V},\mathcal{A})$ , where $\mathcal{V}$ is the set of nodes representing the events, while $\mathcal{A}$ is the set of arcs representing the activities. We assume that the node with index $1$ represents the start of the project and the node with index $|\mathcal{V}|$ represents the end of the project. We define $d_{ij}\in[0,1]$ to be the nominal duration of the activity $(i,j)\in\mathcal{A}$ . Here, we assume that the durations $d_{ij}$ , $(i,j)\in\mathcal{A}$ , are already normalized so that they take values in the unit interval.

The goal of project crashing is to determine the best resource assignments $x_{ij}$ , $(i,j)\in\mathcal{A}$ , on the activities that minimize the project completion time or makespan. If the activity duration $d_{ij}-x_{ij}$ represents the length of the arc $(i,j)$ , then the project completion time can be determined by computing the length of the longest path from the start node to the end node. We can formulate project crashing as the optimization problem

[TABLE]

where

[TABLE]

If the task durations $\bm{d}$ are uncertain and are only known to belong to the prescribed uncertainty set $\mathcal{D}\subseteq[0,1]^{|\mathcal{A}|}$ , then we arrive at the robust optimization problem

[TABLE]

By combining the suprema over $\mathcal{D}$ and $\mathcal{Z}$ , and linearizing the bilinear terms $d_{ij}z_{ij}$ , $(i,j)\in\mathcal{A}$ , we can reformulate the objective of this problem as

[TABLE]

where

[TABLE]

Using the new objective function (7) and uncertainty set (8), the resulting robust optimization problem constitutes an instance of (3) with the input parameters $\bm{A}(\bm{x})=\bm{0}$ , $\bm{b}(\bm{x})=[\bm{0}^{\top}\;\;-\bm{x}^{\top}\;\;\mathbf{e}^{\top}]^{\top}$ , and $c(\bm{x})=0$ .
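The makespan evaluation described in Example 2 can be illustrated with a small sketch. The four-node network, the durations, the crashing decision and the finite list of duration scenarios below are all hypothetical, and the code assumes that nodes are numbered so that every arc $(i,j)$ satisfies $i<j$. Enumerating a finite scenario list only stands in for the supremum over $\mathcal{D}$; it is not the reformulation developed in the paper.

```python
def makespan(num_nodes, arcs, d, x):
    """Length of the longest path from node 1 to node num_nodes.

    Arc (i, j) has length d[(i, j)] - x[(i, j)].
    Assumes i < j on every arc, so sorting arcs by tail is a topological order.
    """
    longest = {v: float("-inf") for v in range(1, num_nodes + 1)}
    longest[1] = 0.0
    for (i, j) in sorted(arcs):
        if longest[i] > float("-inf"):
            longest[j] = max(longest[j], longest[i] + d[(i, j)] - x[(i, j)])
    return longest[num_nodes]


def worst_case_makespan(num_nodes, arcs, scenarios, x):
    """Largest makespan over a finite (hypothetical) list of duration vectors."""
    return max(makespan(num_nodes, arcs, d, x) for d in scenarios)


# Hypothetical data: 4 events, 5 activities, durations normalized to [0, 1].
arcs = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
d_nominal = {(1, 2): 0.5, (1, 3): 0.8, (2, 3): 0.1, (2, 4): 0.6, (3, 4): 0.2}
d_delayed = {**d_nominal, (3, 4): 0.9}  # one adverse scenario

no_crash = {a: 0.0 for a in arcs}
crash = {**no_crash, (2, 4): 0.3}  # spend resources on activity (2, 4)

nominal = makespan(4, arcs, d_nominal, no_crash)
crashed = makespan(4, arcs, d_nominal, crash)
worst = worst_case_makespan(4, arcs, [d_nominal, d_delayed], crash)
print(nominal, crashed, worst)
```

In this toy instance, crashing activity $(2,4)$ shortens the nominal critical path, but the adverse scenario shifts the critical path to $1\to 3\to 4$, which is the effect a robust formulation is meant to anticipate.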

In the remainder of the paper, for any fixed $\bm{x}\in\mathcal{X}$ , we define the mixed-integer quadratic program

[TABLE]

which corresponds to the inner subproblem in the objective of (3). We may therefore represent (3) as

[TABLE]

In the next section, we derive an exact copositive programming reformulation for evaluating $Z(\bm{x})$ . By substituting $Z(\bm{x})$ with the emerging copositive program, we obtain an equivalent finite-dimensional convex reformulation for the RQP (3) that is, in principle, amenable to numerical solution.

3 Copositive Programming Reformulation

In this section, we derive an equivalent copositive programming reformulation for (3) by adopting the following steps. For any fixed $\bm{x}\in\mathcal{X}$ , we first derive a copositive upper bound on $Z(\bm{x})$ . We then show that the resulting reformulation is in fact exact under the boundedness assumption on the uncertainty set $\Xi$ .

3.1 A Copositive Upper Bound on $Z(\bm{x})$

To derive the copositive reformulation, we leverage the following result by Burer [11] which enables us to reduce a generic mixed-binary quadratic program into an equivalent conic program of polynomial size.

Theorem 1 ([11, Theorem 2.6]).

The mixed-binary quadratic program

[TABLE]

is equivalent to the completely positive program

[TABLE]

where $\mathcal{L}\subseteq[P]$ , and it is implicitly assumed that $\xi_{\ell}\leq 1,\;\ell\in\mathcal{L}$ , for any $\bm{\xi}\in\mathbb{R}_{+}^{P}$ satisfying $\bm{F}\bm{\xi}=\bm{g}$ .

We also rely on the following standard result which allows us to represent a scalar integer variable using only logarithmically many binary variables [47].

Lemma 1.

If $\xi$ is a scalar integer decision variable taking values in $\{0,\cdots,2^{Q}-1\}$ , with $Q\in\mathbb{Z}_{+}$ , then we can reformulate it concisely by employing $Q$ binary decision variables $\chi_{1},\cdots,\chi_{Q}\in\{0,1\}$ , as follows:

$$\xi=\sum_{q\in[Q]}2^{q-1}\chi_{q}=\mathbf{v}_{Q}^{\top}\bm{\chi}.$$
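The binary expansion in Lemma 1 can be sketched in a few lines. The function names below are illustrative and not taken from the paper; the code simply encodes an integer in $\{0,\dots,2^{Q}-1\}$ with $Q$ binary digits and recovers it through the vector $\mathbf{v}_{Q}$ of powers of 2 from the Notation paragraph.

```python
def binary_expansion(xi, Q):
    """Return [chi_1, ..., chi_Q] in {0,1}^Q with xi = sum_q 2**(q-1) * chi_q."""
    if not 0 <= xi <= 2**Q - 1:
        raise ValueError("xi must lie in {0, ..., 2^Q - 1}")
    return [(xi >> q) & 1 for q in range(Q)]


def from_binary(chi):
    """Recover xi = v_Q^T chi, where v_Q = [2^0, 2^1, ..., 2^(Q-1)]."""
    v_Q = [2**q for q in range(len(chi))]
    return sum(v * c for v, c in zip(v_Q, chi))


def bits_needed(upper_bound):
    """Smallest Q such that upper_bound <= 2^Q - 1."""
    return upper_bound.bit_length()


chi = binary_expansion(5, 3)
print(chi, from_binary(chi))
```

Only $Q$ binary variables are needed per integer component, which is what keeps the lifted problem polynomial in the size of the input.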

Using Theorem 1 and Lemma 1, we are now ready to state our first result.

Proposition 1.

For any fixed decision $\bm{x}\in\mathcal{X}$ the optimal value of the quadratic maximization problem (9) coincides with the optimal value of the completely positive program

[TABLE]

where

[TABLE]

with

[TABLE]

Proof.

Lemma 1 enables us to reformulate the mixed-integer quadratic program (9) equivalently as the mixed-binary quadratic program

[TABLE]

We now employ Theorem 1 to derive the equivalent completely positive program for (13). To this end, we first bring the above quadratic program into the standard form (10). We introduce the redundant linear constraints $\bm{\chi}_{\ell}\leq\mathbf{e}$ , $\ell\in[L]$ , which are pertinent for the exactness of the reformulation, and we define new auxiliary slack variables $\bm{\eta}_{\ell}$ , $\ell\in[L]$ , to transform these inequalities into the equality constraints $\bm{\chi}_{\ell}+\bm{\eta}_{\ell}=\mathbf{e}$ , $\forall\ell\in[L]$ . This yields the equivalent problem

[TABLE]

We next define the expanded vector

[TABLE]

that comprises all decision variables in (14). Together with the augmented parameters (12), we can reformulate (14) concisely as the problem

[TABLE]

The mixed-binary quadratic program (15) already has the desired standard form (10) with inputs $P={K^{\prime}}$ , $\bm{Q}={\bm{\mathcal{A}}}(\bm{x})^{\top}{\bm{\mathcal{A}}}(\bm{x})$ , $\bm{r}={\bm{\mathscr{b}}}(\bm{x})$ , $\bm{F}={\bm{\mathcal{S}}}$ , $\bm{g}=\bm{t}$ , and $\mathcal{L}=[LQ]$ . We may thus apply Theorem 1 to obtain the equivalent completely positive program (11). This completes the proof. ∎

We remark that in view of the concise representation in Lemma 1, the size of the completely positive program (11) remains polynomial in the size of the input data. This completely positive program admits a dual copositive program given by

[TABLE]

By weak conic duality, the optimal value of this copositive program constitutes an upper bound on $Z(\bm{x})$ .

Proposition 2.

For any fixed decision $\bm{x}\in\mathcal{X}$ we have $\overline{Z}(\bm{x})\geq Z(\bm{x})$ .

3.2 A Copositive Reformulation of RQP

In this section, we demonstrate strong duality for the primal and dual pair (11) and (16), respectively, under the natural boundedness assumption on the uncertainty set $\Xi$ . This exactness result enables us to reformulate the RQP (3) equivalently as a copositive program of polynomial size.

Theorem 2 (Strong Duality).

For any fixed decision $\bm{x}\in\mathcal{X}$ we have $\overline{Z}(\bm{x})=Z(\bm{x})$ .

We would like to mention that a similar result is proved in a recent paper (Theorem 8 in [9]). However, the two proofs are quite different from one another. While the proof in [9] establishes strong duality by proving the existence of a Slater point for a general copositive program, we show explicitly how to construct a Slater point for the copositive program (16) from input parameters. Because of its constructive nature, we believe our proof is interesting on its own and sheds some light on the geometry of the feasible region of (16).

We note that the primal completely positive program (11) never has an interior [12]. In order to prove Theorem 2, we construct a Slater point for the dual copositive program (16). The construction of the Slater point for problem (16) relies on the following two lemmas. We observe that by construction the boundedness of the uncertainty set $\Xi$ means that the lifted polytope

[TABLE]

is also bounded. This gives rise to the following lemma on the strict copositivity of the matrix ${\bm{\mathcal{S}}}^{\top}{\bm{\mathcal{S}}}$ .

Lemma 2.

We have ${\bm{\mathcal{S}}}^{\top}{\bm{\mathcal{S}}}\succ_{\mathcal{C}}\bm{0}$ .

Proof.

The boundedness assumption implies that the recession cone of the set ${\Xi^{\prime}}$ coincides with the point $\bm{0}$ , that is, $\{{{{\bm{\xi}}^{\prime}}}\in\mathbb{R}_{+}^{{K^{\prime}}}:{\bm{\mathcal{S}}}{{\bm{\xi}}^{\prime}}=\bm{0}\}=\{\bm{0}\}$ . Thus, for every ${{{\bm{\xi}}^{\prime}}}\geq\bm{0}$ , ${{{\bm{\xi}}^{\prime}}}\neq\bm{0}$ , we must have ${\bm{\mathcal{S}}}{{{\bm{\xi}}^{\prime}}}\neq\bm{0}$ , which further implies that ${{{\bm{\xi}}^{\prime}}}^{\top}{\bm{\mathcal{S}}}^{\top}{\bm{\mathcal{S}}}{{{\bm{\xi}}^{\prime}}}>0$ for all ${{{\bm{\xi}}^{\prime}}}\geq\bm{0}$ such that ${{{\bm{\xi}}^{\prime}}}\neq\bm{0}$ . Hence, the matrix ${\bm{\mathcal{S}}}^{\top}{\bm{\mathcal{S}}}$ is strictly copositive. ∎
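As a small illustration of Lemma 2, the sketch below uses the one-row constraint matrix $\bm{S}=[2\;1]$ (the data that appears later in Example 3, where there are no integer components, so the lifted matrix is assumed to coincide with $\bm{S}$). The quadratic form of $\bm{S}^{\top}\bm{S}$ equals $(2\xi_{1}+\xi_{2})^{2}$, which is positive on the non-negative orthant away from the origin but vanishes at $(1,-2)$. The grid evaluation is only a sampled sanity check, not a proof of strict copositivity.

```python
from itertools import product


def gram(S):
    """Return S^T S as a list of lists."""
    J, K = len(S), len(S[0])
    return [[sum(S[j][a] * S[j][b] for j in range(J)) for b in range(K)]
            for a in range(K)]


def quad_form(M, v):
    """Evaluate v^T M v."""
    n = len(v)
    return sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


S = [[2, 1]]          # single equality row: 2*xi_1 + xi_2 = t
G = gram(S)           # the matrix S^T S

# Sampled check on nonzero non-negative integer points.
grid = [v for v in product(range(4), repeat=2) if any(v)]
min_on_orthant = min(quad_form(G, v) for v in grid)

# A mixed-sign direction where the form vanishes, so G is not positive definite.
value_off_orthant = quad_form(G, (1, -2))

print(G, min_on_orthant, value_off_orthant)
```

This is the gap that matters in Section 4: a strictly copositive matrix need not be positive definite.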

The next lemma, which was proven in [29, Lemma 4], constitutes an extension of the Schur complements lemma for matrices with a copositive sub-matrix. We include the proof here to keep the paper self-contained.

Lemma 3 (Copositive Schur Complements).

Consider the symmetric matrix

[TABLE]

We then have $\bm{M}\succ_{\mathcal{C}}\bm{0}$ if $\bm{R}-\bm{Q}^{\top}\bm{P}^{-1}\bm{Q}\succ_{\mathcal{C}}\bm{0}$ and $\bm{P}\succ\bm{0}$ .

Proof.

Consider a non-negative vector $[\bm{\xi}^{\top}\;\bm{\rho}^{\top}]^{\top}\in\mathbb{R}_{+}^{P+Q}$ satisfying $\mathbf{e}^{\top}\bm{\xi}+\mathbf{e}^{\top}\bm{\rho}=1$ . We have

[TABLE]

The final inequality follows from the assumptions $\bm{P}\succ\bm{0}$ , $\bm{R}-\bm{Q}^{\top}\bm{P}^{-1}\bm{Q}\succ_{\mathcal{C}}\bm{0}$ and $\bm{\rho}\geq\bm{0}$ . In fact, the inequality will be strict, which can be shown by considering the following two cases:

1. If $\bm{\rho}=\bm{0}$ , then $\mathbf{e}^{\top}\bm{\xi}=1$ . Therefore $\bm{\xi}\neq\bm{0}$ , which implies that $(\bm{\xi}+\bm{P}^{-1}\bm{Q}\bm{\rho})^{\top}\bm{P}(\bm{\xi}+\bm{P}^{-1}\bm{Q}\bm{\rho})>0$ .

2. If $\bm{\rho}\neq\bm{0}$ , then the assumption $\bm{R}-\bm{Q}^{\top}\bm{P}^{-1}\bm{Q}\succ_{\mathcal{C}}\bm{0}$ implies that $\bm{\rho}^{\top}(\bm{R}-\bm{Q}^{\top}\bm{P}^{-1}\bm{Q})\bm{\rho}>0$ .

Therefore, in both cases, by rescaling we have $[\bm{\xi}^{\top}\;\bm{\rho}^{\top}]\bm{M}[\bm{\xi}^{\top}\;\bm{\rho}^{\top}]^{\top}>0$ for all $[\bm{\xi}^{\top}\;\bm{\rho}^{\top}]^{\top}\in\mathbb{R}_{+}^{P+Q}$ such that $[\bm{\xi}^{\top}\;\bm{\rho}^{\top}]^{\top}\neq\bm{0}$ . Hence, $\bm{M}\succ_{\mathcal{C}}\bm{0}$ . ∎
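The completing-the-square step behind this proof can be written out explicitly. The display below is a sketch that assumes the block form $\bm{M}=\begin{bmatrix}\bm{P}&\bm{Q}\\ \bm{Q}^{\top}&\bm{R}\end{bmatrix}$, which is inferred from the statement of Lemma 3 and the two cases above rather than copied from the original display:

$$
\begin{aligned}
\begin{bmatrix}\bm{\xi}\\ \bm{\rho}\end{bmatrix}^{\top}\bm{M}\begin{bmatrix}\bm{\xi}\\ \bm{\rho}\end{bmatrix}
&=\bm{\xi}^{\top}\bm{P}\bm{\xi}+2\bm{\xi}^{\top}\bm{Q}\bm{\rho}+\bm{\rho}^{\top}\bm{R}\bm{\rho}\\
&=(\bm{\xi}+\bm{P}^{-1}\bm{Q}\bm{\rho})^{\top}\bm{P}(\bm{\xi}+\bm{P}^{-1}\bm{Q}\bm{\rho})+\bm{\rho}^{\top}(\bm{R}-\bm{Q}^{\top}\bm{P}^{-1}\bm{Q})\bm{\rho}\;\geq\;0.
\end{aligned}
$$

The first term is non-negative because $\bm{P}\succ\bm{0}$, and the second because $\bm{R}-\bm{Q}^{\top}\bm{P}^{-1}\bm{Q}$ is copositive and $\bm{\rho}\geq\bm{0}$; the case distinction shows that at least one of the two terms is strictly positive.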

Using Lemmas 2 and 3, we are now ready to prove the main strong duality result.

Proof of Theorem 2.

We construct a Slater point $(\tau,\bm{\psi},\bm{\phi},\bm{\gamma})$ for problem (16). Specifically, we set $\bm{\gamma}=\bm{0}$ , $\bm{\psi}=\bm{0}$ , and $\bm{\phi}=\rho\mathbf{e}$ for some $\rho>0$ . Problem (16) then admits a Slater point if there exist scalars $\rho,\tau>0$ , such that

[TABLE]

Lemma 2 implies that for a sufficiently large $\rho$ the matrix $\rho{\bm{\mathcal{S}}}^{\top}{\bm{\mathcal{S}}}-{\bm{\mathcal{A}}}(\bm{x})^{\top}{\bm{\mathcal{A}}}(\bm{x})$ is strictly copositive. Thus, we can choose a positive $\tau$ to ensure that

[TABLE]

Using Lemma 3, we may conclude that the strict copositivity constraint in (18) is satisfied by the constructed solution $(\tau,\bm{\psi},\bm{\phi},\bm{\gamma})$ . Thus, problem (16) admits a Slater point and strong duality indeed holds for the primal and dual pair (11) and (16), respectively. ∎

The exactness result portrayed in Theorem 2 enables us to derive the equivalent copositive programming reformulation for (3).

Theorem 3.

The RQP (3) is equivalent to the following copositive program.

[TABLE]

The proof of Theorem 3 relies on the following lemma, which linearizes the quadratic term ${\bm{\mathcal{A}}}(\bm{x})^{\top}{\bm{\mathcal{A}}}(\bm{x})$ in the left-hand side matrix of problem (16).

Lemma 4.

Let $\bm{M}\in\mathbb{S}^{R}$ be a symmetric matrix and $\bm{A}\in\mathbb{R}^{P\times Q}$ be an arbitrary matrix with $Q\leq R$ . Then the copositive inequality

[TABLE]

is satisfied if and only if there exists a positive semidefinite matrix $\bm{H}\in\mathbb{S}_{+}^{Q}$ such that

[TABLE]

Proof.

The only if statement is satisfied immediately by setting $\bm{H}=\bm{A}^{\top}\bm{A}$ . To prove the converse statement, assume that there exists such a positive semidefinite matrix $\bm{H}\in\mathbb{S}_{+}^{Q}$ . Then by the Schur complement the semidefinite inequality in (21) implies that $\bm{H}\succeq\bm{A}^{\top}\bm{A}$ and, a fortiori, $\bm{H}\succeq_{\mathcal{C}}\bm{A}^{\top}\bm{A}$ . Combining this with the copositive inequality in (21) then yields (20). Thus, the claim follows. ∎
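For reference, the Schur complement fact invoked in this proof is the standard equivalence below. It is stated under the assumption that the semidefinite inequality in (21) is the block inequality on the left, which is how the argument reads:

$$
\begin{bmatrix}\mathbb{I}&\bm{A}\\ \bm{A}^{\top}&\bm{H}\end{bmatrix}\succeq\bm{0}
\quad\Longleftrightarrow\quad
\bm{H}-\bm{A}^{\top}\bm{A}\succeq\bm{0}.
$$

Since a positive semidefinite matrix satisfies $\bm{\xi}^{\top}(\bm{H}-\bm{A}^{\top}\bm{A})\bm{\xi}\geq 0$ for every $\bm{\xi}$, and in particular for every $\bm{\xi}\geq\bm{0}$, the relation $\bm{H}\succeq\bm{A}^{\top}\bm{A}$ immediately gives $\bm{H}\succeq_{\mathcal{C}}\bm{A}^{\top}\bm{A}$.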

Proof of Theorem 3.

Applying Theorem 2, we may replace the objective function of (3) with the corresponding copositive reformulation. We thus find that problem (3) is equivalent to

[TABLE]

Next, we apply Lemma 4 to linearize the quadratic terms ${\bm{\mathcal{A}}}(\bm{x})^{\top}{\bm{\mathcal{A}}}(\bm{x})$ , which gives rise to the desired copositive program (19). This completes the proof. ∎

4 Conservative Semidefinite Programming Approximation

The copositive program (19) is intractable due to its equivalence with generic RQPs over a polyhedral uncertainty set [4]. In the copositive reformulation, however, all the difficulty of the original problem (3) is shifted into the copositive cone $\mathcal{C}$ , which has been well-studied in the literature. Specifically, there exists a hierarchy of increasingly tight semidefinite representable inner approximations that converge in finitely many iterations to $\mathcal{C}$ [40, 10, 16, 33]. The simplest of these approximations is given by the cone

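$$\mathcal{C}^{0}=\left\{\bm{M}\in\mathbb{S}^{K}:\bm{M}=\bm{P}+\bm{N},\;\bm{P}\succeq\bm{0},\;\bm{N}\geq\bm{0}\right\},$$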

which contains all symmetric matrices that can be decomposed into a sum of positive semidefinite and non-negative matrices. For dimensions $K\leq 4$ it can be shown that $\mathcal{C}^{0}=\mathcal{C}$ [18], while for $K>4$ , $\mathcal{C}^{0}$ is a strict subset of $\mathcal{C}$ .
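A classical witness for the strict inclusion when $K=5$ is the Horn matrix, which is copositive but cannot be written as a sum of a positive semidefinite and a non-negative matrix. The sketch below does not certify either property; it only samples the quadratic form on a non-negative integer grid and exhibits one mixed-sign vector on which the form is negative, showing that the matrix is not positive semidefinite. Certifying non-membership in $\mathcal{C}^{0}$ would require solving a semidefinite feasibility problem, which is outside the scope of this snippet.

```python
from itertools import product

# The 5x5 Horn matrix (circulant with first row 1, -1, 1, 1, -1).
HORN = [
    [ 1, -1,  1,  1, -1],
    [-1,  1, -1,  1,  1],
    [ 1, -1,  1, -1,  1],
    [ 1,  1, -1,  1, -1],
    [-1,  1,  1, -1,  1],
]


def quad_form(M, v):
    """Evaluate v^T M v."""
    n = len(v)
    return sum(M[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


# Sampled copositivity check on the grid {0, 1, 2, 3}^5 (necessary, not sufficient).
min_sampled = min(quad_form(HORN, v) for v in product(range(4), repeat=5))

# A vector with a negative entry on which the form is negative.
value_mixed_sign = quad_form(HORN, (1, 1, 0, -1, 0))

print(min_sampled, value_mixed_sign)
```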

Replacing the cone $\mathcal{C}$ in (19) with the inner approximation $\mathcal{C}^{0}$ gives rise to a tractable conservative approximation for the RQP (3). In this case, however, the resulting optimization problem might have no interior or even become infeasible as the Slater point constructed in Theorem 2 can fail to be a Slater point to the restricted problem. Indeed, the strict copositivity of the matrix ${\bm{\mathcal{S}}}^{\top}{\bm{\mathcal{S}}}$ is in general insufficient to ensure that the matrix is also strictly positive definite. To remedy this shortcoming, we suggest the following simple modification to the primal completely positive formulation of $Z(\bm{x})$ in (11). Specifically, we assume that there exists a non-degenerate ellipsoid centered at $\bm{c}\in\mathbb{R}_{+}^{{K^{\prime}}}$ with radius $r\in\mathbb{R}_{++}$ and shape parameter $\bm{Q}\in\mathbb{S}_{++}^{{K^{\prime}}}$ given by

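$$\mathcal{B}(r,\bm{Q},\bm{c})=\left\{{\bm{\xi}}^{\prime}\in\mathbb{R}^{K^{\prime}}:\left\lVert\bm{Q}({\bm{\xi}}^{\prime}-\bm{c})\right\rVert\leq r\right\}$$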

that contains the lifted set ${\Xi^{\prime}}$ in (17). We then consider the following augmented completely positive programming reformulation for the maximization problem (9).

[TABLE]

Here, we have added the redundant constraint $\textup{tr}\left(\bm{Q}\bm{\Omega}\bm{Q}^{\top}\right)-2\bm{c}^{\top}\bm{Q}^{\top}\bm{Q}{{{\bm{\xi}}^{\prime}}}+\bm{c}^{\top}\bm{Q}^{\top}\bm{Q}\bm{c}\leq r^{2}$ to (11), which arises from linearizing the quadratic constraint

$$\left\lVert\bm{Q}({\bm{\xi}}^{\prime}-\bm{c})\right\rVert^{2}\leq r^{2},$$

where we have set $\bm{\Omega}={{\bm{\xi}}^{\prime}}{{{\bm{\xi}}^{\prime}}}^{\top}$ . The dual of the augmented problem (22) is given by the following copositive program.

[TABLE]

Note that we have $Z(\bm{x})=\overline{Z}(\bm{x})$ since all the new additional terms are redundant for the original reformulations. Nevertheless, since the ellipsoid $\mathcal{B}(r,\bm{Q},\bm{c})$ is non-degenerate, we find that the matrix $\bm{Q}^{\top}\bm{Q}$ is positive definite. We can thus set all eigenvalues of the scaled matrix $\lambda\bm{Q}^{\top}\bm{Q}$ to any arbitrarily large positive values by controlling the scalar $\lambda\in\mathbb{R}_{+}$ . This suggests that replacing the cone $\mathcal{C}$ with its inner approximation $\mathcal{C}^{0}$ in (23) will always yield a problem with a Slater point.
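The linearization of the ellipsoidal constraint used above can be checked numerically. The sketch below uses small hypothetical integer data for $\bm{Q}$, $\bm{c}$ and ${\bm{\xi}}^{\prime}$ and confirms that, with $\bm{\Omega}={\bm{\xi}}^{\prime}{{\bm{\xi}}^{\prime}}^{\top}$, the linear expression $\textup{tr}(\bm{Q}\bm{\Omega}\bm{Q}^{\top})-2\bm{c}^{\top}\bm{Q}^{\top}\bm{Q}{\bm{\xi}}^{\prime}+\bm{c}^{\top}\bm{Q}^{\top}\bm{Q}\bm{c}$ coincides with $\lVert\bm{Q}({\bm{\xi}}^{\prime}-\bm{c})\rVert^{2}$.

```python
def matvec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def dot(u, v):
    """Inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


def outer(u, v):
    """Outer product u v^T."""
    return [[a * b for b in v] for a in u]


def trace_Q_Omega_Qt(Q, Omega):
    """tr(Q Omega Q^T) = sum_k q_k^T Omega q_k, where q_k is the k-th row of Q."""
    return sum(dot(row, matvec(Omega, row)) for row in Q)


# Hypothetical data (integers keep the arithmetic exact).
Q = [[2, 0], [1, 1]]
c = [1, 2]
xi = [3, 1]

Omega = outer(xi, xi)                        # Omega = xi xi^T
Qc, Qxi = matvec(Q, c), matvec(Q, xi)

linearized = trace_Q_Omega_Qt(Q, Omega) - 2 * dot(Qc, Qxi) + dot(Qc, Qc)
diff = matvec(Q, [a - b for a, b in zip(xi, c)])
quadratic = dot(diff, diff)                  # ||Q (xi - c)||^2

print(linearized, quadratic)
```

The two values agree only because $\bm{\Omega}$ is set to the rank-one matrix ${\bm{\xi}}^{\prime}{{\bm{\xi}}^{\prime}}^{\top}$; once the cone is relaxed, the linearized constraint can cut off points that the other constraints would admit, which is why it is no longer redundant in the approximation.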

Apart from helping us prove the existence of a Slater point, adding an ellipsoidal constraint to the description of the uncertainty set can also be of help numerically. Although the constraint is redundant for the exact problem, it might not be redundant for the conservative approximation obtained by replacing $\mathcal{C}$ with $\mathcal{C}^{0}$ . Adding the constraint results in an additional variable $\lambda$ in the SDP approximation, which can improve the objective value. Ideally, we would like the volume of the ellipsoid to be as small as possible to get more improvement. However, determining the parameters of the ellipsoid having minimum volume that encloses the set ${\Xi}$ is NP-hard. A feasible ellipsoid that can be generated tractably is $\{\bm{\xi}\in\mathbb{R}^{K}:\|\bm{\xi}\|\leq\left\lVert\bm{r}\right\rVert\}$ , where

[TABLE]

Note that the parameter $\bm{r}$ of the ellipsoid can be determined by solving $K$ linear programs. Depending on the specific uncertainty set at hand, it might be possible to find other tighter ellipsoidal approximations.

4.1 Comparison with the Approximate $\mathcal{S}$ -lemma Method

Next, we show that solving the problem by replacing $\mathcal{C}$ with the simplest inner approximation $\mathcal{C}^{0}$ is better than the approximate $\mathcal{S}$ -lemma method. Since the latter is only valid in the case of continuous uncertain parameters, we restrict the discussion to the case where the bounded uncertainty set contains no integral terms and is given by the polytope $\Xi=\left\{\bm{\xi}\in\mathbb{R}_{+}^{K}:\bm{S}\bm{\xi}=\bm{t}\right\}$ . Here, the extended parameters (12) simplify to

[TABLE]

while the maximization problem (9) reduces to

[TABLE]

The copositive programming reformulation (23) can then be simplified to

[TABLE]

Replacing the cone $\mathcal{C}$ in (25) with its inner approximation $\mathcal{C}^{0}$ , we obtain a tractable SDP reformulation whose optimal value $\overline{Z}^{\mathcal{C}_{0}}(\bm{x})$ constitutes an upper bound on ${Z}(\bm{x})$ . Alternatively, we describe the approximate $\mathcal{S}$ -lemma method below, which provides a different conservative SDP approximation for (24).

Proposition 3 (Approximate $\mathcal{S}$ -lemma Method [4]).

Assume that the uncertainty set is a bounded polytope and there is an ellipsoid centered at $\bm{c}\in\mathbb{R}_{+}^{K}$ of radius $r$ given by $\mathcal{B}(r,\bm{Q},\bm{c})=\{{{\bm{\xi}}}\in\mathbb{R}^{K}:\|\bm{Q}({{\bm{\xi}}}-\bm{c})\|\leq r\}$ that contains the set ${\Xi}$ . Then, for any fixed $\bm{x}\in\mathcal{X}$ , the maximization problem (9) is upper bounded by the optimal value of the following semidefinite program:

[TABLE]

Proof.

The quadratic maximization problem in (24) can be equivalently reformulated as

[TABLE]

Here, the last constraint is added without loss of generality since $\Xi\subseteq\mathcal{B}(r,\bm{Q},\bm{c})$ . Reformulating the problem into its Lagrangian form then yields

[TABLE]

where the inequality follows from the weak Lagrangian duality. We next introduce an epigraphical variable $\kappa$ that shifts the supremum in the objective function into the constraint. We have

[TABLE]

Reformulating the semi-infinite constraint as a semidefinite constraint then yields the desired reformulation (26). This completes the proof. ∎

The next proposition shows that the approximation resulting from replacing the copositive cone $\mathcal{C}$ in (25) with its coarsest inner approximation $\mathcal{C}^{0}$ is stronger than the state-of-the-art approximate $\mathcal{S}$ -lemma method.

Proposition 4.

The following relation holds.

[TABLE]

Proof.

The equality and the first inequality hold by construction. To prove the second inequality, we consider the following semidefinite program that arises from replacing the cone $\mathcal{C}$ with the inner approximation $\mathcal{C}^{0}$ in (25).

[TABLE]

Next, we show that any feasible solution $(\kappa,\rho,\bm{\theta},\bm{\eta})$ to (26) can be used to construct a feasible solution $(\tau,\lambda,h,\bm{\psi},\bm{\phi},\bm{F},\;\bm{g})$ to (27) with the same objective value. Specifically, we set $\tau=\kappa$ , $\lambda=\rho$ , $h=0$ , $\bm{\psi}=\bm{\theta}$ , $\bm{\phi}=\bm{0}$ , $\bm{F}=\bm{0}$ , and $\bm{g}=\bm{\eta}$ . The feasibility of the solution $(\kappa,\rho,\bm{\theta},\bm{\eta})$ in (26) then implies that the constructed solution $(\tau,\lambda,h,\bm{\psi},\bm{\phi},\bm{F},\;\bm{g})$ is also feasible in (27). One can verify that these solutions give rise to the same objective function value for the respective problems. Thus, the claim follows. ∎

Next, we demonstrate that the inequality $\overline{Z}^{\mathcal{C}_{0}}(\bm{x})\leq\overline{Z}^{\mathcal{S}}(\bm{x})$ in Proposition 4 can often be strict. This affirms that the proposed SDP approximation (27) is indeed stronger than the approximate $\mathcal{S}$ -lemma method.

Example 3.

Consider the following quadratic maximization problem:

[TABLE]

A simple analysis shows that $Z(\bm{x})=1$ , which is attained at the solution $(\xi_{1},\xi_{2})=(1,0)$ . The problem (28) constitutes an instance of problem (24) with the parameterizations

[TABLE]

Here, the uncertainty set is given by the polytope $\Xi=\{\bm{\xi}\in\mathbb{R}_{+}^{2}:2\xi_{1}+\xi_{2}=2\}$ , which corresponds to the inputs $\bm{S}=[2\;1]$ and $\bm{t}=2$ . Replacing the cone $\mathcal{C}$ with its inner approximation $\mathcal{C}^{0}$ in the copositive programming reformulation of (28), we find that the resulting semidefinite program yields the same optimal objective value of $\overline{Z}^{\mathcal{C}_{0}}(\bm{x})=1$ . Meanwhile, the corresponding approximate $\mathcal{S}$ -lemma method yields an optimal objective value $\overline{Z}^{\mathcal{S}}(\bm{x})=4$ . Thus, while the SDP approximation of the copositive program (25) is tight, the approximate $\mathcal{S}$ -lemma method generates an inferior objective value for the simple instance (28).

5 Extensions

In this section, we discuss several extensions to the RQP (3) which are also amenable to exact copositive programming reformulations. In Section 5.1, we study two-stage robust optimization with a mixed-integer uncertainty set where the objective is quadratic in the first- and the second-stage decision variables. In Section 5.2, we develop an extension to the case when the model has robust quadratic constraints. Finally, in Section 5.3, we discuss the case where the objective function contains quadratic terms which are not convex in the uncertain parameter vector $\bm{\xi}$ .

5.1 Two-Stage Robust Quadratic Optimization

In this section, we study the two-stage robust quadratic optimization problems of the form

[TABLE]

Here, for any fixed decision $\bm{x}\in\mathcal{X}$ and uncertain parameter realization $\bm{\xi}\in\Xi$ , the second-stage cost $\mathcal{R}(\bm{x},\bm{\xi})$ coincides with the optimal value of the convex quadratic program given by

[TABLE]

where $\bm{T}(\bm{x}):\mathcal{X}\rightarrow\mathbb{R}^{T\times K}$ and $\bm{h}(\bm{x}):\mathcal{X}\rightarrow\mathbb{R}^{T}$ are matrix- and vector-valued affine functions, respectively.

Example 4 (Support Vector Machines with Noisy Labels).

Consider the following soft-margin support vector machines (SVM) model for data classification.

[TABLE]

Here, for every index $n\in[N]$ , the vector $\hat{\bm{\chi}}_{n}\in\mathbb{R}^{K}$ is a data point that has been labeled as $\hat{\xi}_{n}\in\{-1,1\}$ . The objective of problem (31) is to find a hyperplane $\{\bm{\chi}\in\mathbb{R}^{K}:\bm{w}^{\top}\bm{\chi}=w_{0}\}$ that separates all points labeled $+1$ from the ones labeled $-1$ . If the hyperplane satisfies $\hat{\xi}_{n}(\bm{w}^{\top}\hat{\bm{\chi}}_{n}-w_{0})>1$ , $n\in[N]$ , then the data points are linearly separable. In practice, however, these data points may not be linearly separable. We thus seek the best linear separator that minimizes the number of incorrect classifications. This non-convex objective is captured by employing the hinge loss term $\sum_{n\in[N]}\max\left\{0,1-\hat{\xi}_{n}(\bm{w}^{\top}\hat{\bm{\chi}}_{n}-w_{0})\right\}$ in (31) as a convex surrogate. Here, the term $\lambda\|\bm{w}\|^{2}$ in the objective function constitutes a regularizer for the coefficient $\bm{w}$ .

If the labels $\{\hat{\xi}_{n}\}_{n\in[N]}$ are erroneous, then one could envisage a robust optimization model that seeks the best linear separator in view of the most adverse realization of the labels. To this end, we assume that the vector of labels $\bm{\xi}$ is only known to reside in a prescribed binary uncertainty set $\Xi\subseteq\{-1,1\}^{N}$ . Then an SVM model that is robust against uncertainty in the labels can be formulated as

[TABLE]

where

[TABLE]

This problem constitutes an instance of (29) with the decision vector $\bm{x}=(\bm{w},w_{0})$ , and the input parameters

[TABLE]
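To make the robust SVM objective of Example 4 concrete, the sketch below evaluates the regularized hinge loss $\lambda\|\bm{w}\|^{2}+\sum_{n\in[N]}\max\{0,1-\xi_{n}(\bm{w}^{\top}\hat{\bm{\chi}}_{n}-w_{0})\}$ for a fixed classifier and then takes its maximum over a label uncertainty set by brute-force enumeration. The uncertainty set used here, all label vectors that differ from the nominal labels in at most `max_flips` positions, is a hypothetical choice made for illustration, as are the data points and the classifier. Enumeration is only viable for tiny $N$ and is not the copositive approach of the paper.

```python
from itertools import combinations


def hinge_objective(w, w0, points, labels, lam):
    """lam * ||w||^2 + sum_n max(0, 1 - xi_n * (w^T chi_n - w0))."""
    reg = lam * sum(wi * wi for wi in w)
    loss = 0.0
    for chi, xi in zip(points, labels):
        margin = xi * (sum(wi * ci for wi, ci in zip(w, chi)) - w0)
        loss += max(0.0, 1.0 - margin)
    return reg + loss


def worst_case_objective(w, w0, points, nominal_labels, lam, max_flips):
    """Maximum hinge objective over label vectors with at most max_flips flips.

    The flip budget is an assumed (hypothetical) description of the binary
    uncertainty set; the paper's own set is not reproduced here.
    """
    N = len(nominal_labels)
    worst = float("-inf")
    for k in range(max_flips + 1):
        for flipped in combinations(range(N), k):
            labels = list(nominal_labels)
            for n in flipped:
                labels[n] = -labels[n]
            worst = max(worst, hinge_objective(w, w0, points, labels, lam))
    return worst


# Hypothetical data: three points in the plane and a fixed classifier.
points = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
nominal_labels = [1, -1, 1]
w, w0, lam = (1.0, 0.0), 0.0, 0.5

nominal_value = hinge_objective(w, w0, points, nominal_labels, lam)
robust_value = worst_case_objective(w, w0, points, nominal_labels, lam, max_flips=1)
print(nominal_value, robust_value)
```

In this instance a single flipped label on a confidently classified point raises the objective far more than a flip near the decision boundary, which is the adversarial behavior the robust model guards against.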

The exactness result portrayed in Theorems 2 and 3 can be extended to the two-stage robust optimization problem (29). Specifically, if the problem has complete recourse, then, by employing Theorem 2 and extending the techniques developed in [29, Theorem 4], the two-stage problem (29) can be reformulated as a copositive program of polynomial size. (Footnote 1: The two-stage problem (29) has complete recourse if there exists $\bm{y}^{+}\in\mathbb{R}^{D_{2}}$ with $\bm{W}\bm{y}^{+}>\bm{0}$ , which implies that the second-stage subproblem is feasible for every $\bm{x}\in\mathbb{R}^{D_{1}}$ and $\bm{\xi}\in\mathbb{R}^{K}$ .)

Theorem 4.

Assume that $\bm{P}$ has full column rank. Then the two-stage robust optimization problem (29) is equivalent to the copositive program

[TABLE]

where

[TABLE]

with

[TABLE]

Proof.

Since $\bm{P}$ has full column rank, the matrix $\bm{P}^{\top}\bm{P}$ is positive definite. Thus, for any fixed $\bm{x}\in\mathcal{X}$ and $\bm{\xi}\in\Xi$ , the recourse problem (30) admits a dual quadratic program given by

[TABLE]

Strong duality holds as the two-stage problem (29) has complete recourse. Substituting the dual formulation (33) into the objective of (29) yields

[TABLE]

Thus, for any fixed $\bm{x}\in\mathcal{X}$ , the objective value of the two-stage problem (29) coincides with the optimal value of a quadratic maximization problem, which is amenable to an exact completely positive programming reformulation similar to the one derived in Proposition 1. We can then follow the same steps taken in the proofs of Theorems 2 and 3 to obtain the equivalent copositive program (32). This completes the proof. ∎

Remark 1.

The assumption that $\bm{P}$ has full column rank in Theorem 4 can be relaxed. If $\bm{P}$ does not have full column rank then the symmetric matrix $\bm{P}^{\top}\bm{P}$ is not positive definite but admits the eigendecomposition $\bm{P}^{\top}\bm{P}=\bm{U}\bm{\Lambda}\bm{U}^{-1}$ , where $\bm{U}$ is an orthogonal matrix whose columns are the eigenvectors of $\bm{P}^{\top}\bm{P}$ , while $\bm{\Lambda}$ is a diagonal matrix with the eigenvalues of $\bm{P}^{\top}\bm{P}$ on its main diagonal. We assume without loss of generality that the matrix $\bm{\Lambda}$ has the block diagonal form

[TABLE]

where $\bm{\Lambda}_{+}$ is a diagonal matrix whose main diagonal comprises the non-zero eigenvalues of $\bm{P}^{\top}\bm{P}$ . Next, by using the constructed eigendecomposition and performing the change of variable $\bm{z}\leftarrow\bm{U}^{-1}\bm{y}$ , we can reformulate the recourse problem (30) equivalently as

[TABLE]

where $\bm{U}=[\bm{U}_{+}\;\;\bm{U}_{0}]$ and $\bm{z}=[\bm{z}_{+}^{\top}\;\;\bm{z}_{0}^{\top}]^{\top}$ . The dual of this problem is given by the following quadratic program with a linear constraint system:

[TABLE]

We can then repeat the same steps in the proof of Theorem 4 to obtain an equivalent copositive programming reformulation. We omit this result for the sake of brevity.
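The eigendecomposition step in Remark 1 can be sketched numerically as follows. The code splits the eigenvectors of $\bm{P}^{\top}\bm{P}$ into the blocks $\bm{U}_{+}$ and $\bm{U}_{0}$ and checks that, after the change of variable $\bm{z}=\bm{U}^{-1}\bm{y}=\bm{U}^{\top}\bm{y}$, the quadratic term depends only on $\bm{z}_{+}$. The function name, the tolerance `tol`, and the example matrix are illustrative assumptions.

```python
import numpy as np

def split_eigenspaces(P, tol=1e-10):
    """Return (nonzero eigenvalues, U_plus, U_zero) for the Gram matrix P^T P."""
    gram = P.T @ P
    eigvals, U = np.linalg.eigh(gram)      # symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder so nonzero eigenvalues come first
    eigvals, U = eigvals[order], U[:, order]
    positive = eigvals > tol * max(eigvals[0], 1.0)
    return eigvals[positive], U[:, positive], U[:, ~positive]

# Hypothetical rank-deficient example: P has 3 columns but rank 2.
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
lam_plus, U_plus, U_zero = split_eigenspaces(P)

# Change of variable z = U^T y: ||P y||^2 equals z_plus^T Lambda_plus z_plus.
y = np.array([0.3, -1.2, 0.7])
z_plus = U_plus.T @ y
quad_original = float((P @ y) @ (P @ y))
quad_reduced = float(z_plus @ (lam_plus * z_plus))
```

The columns of `U_zero` span the null space of $\bm{P}$, so the component $\bm{z}_{0}$ enters the transformed recourse problem only through its linear terms and constraints.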

5.2 Robust Quadratically Constrained Quadratic Programming (RQCQP)

The setting that we consider can be extended to the case where, in addition to the robust quadratic objective function, there are several robust quadratic constraints of the form

[TABLE]

In this case, the goal is to find a decision $\bm{x}\in\mathcal{X}$ which minimizes the worst-case objective function, while ensuring that the quadratic constraints are satisfied for all possible uncertain parameter vectors in $\Xi$ .

For every $i\in[I]$ , we define ${\bm{\mathcal{A}}}_{i}(\bm{x})$ and ${\bm{\mathscr{b}}}_{i}(\bm{x})$ similarly to the definitions of the extended parameters ${\bm{\mathcal{A}}}(\bm{x})$ and ${\bm{\mathscr{b}}}(\bm{x})$ in (12). By applying Theorem 2, the quadratic maximization problem in the $i$ -th constraint of (34) can be replaced with a copositive minimization problem, which yields the constraint

[TABLE]

The constraint is satisfied if and only if there exist decision variables $\tau_{i}\in\mathbb{R}$ , $\bm{\psi}_{i}$ , $\bm{\phi}_{i}\in\mathbb{R}^{{J^{\prime}}}$ , and $\bm{\gamma}_{i}\in\mathbb{R}^{LQ}$ such that the constraint system

[TABLE]

is satisfied. Therefore the $i$ -th constraint of (34) can be replaced by the constraint system (35). The procedure for linearization of the quadratic terms ${\bm{\mathcal{A}}}_{i}(\bm{x})^{\top}{\bm{\mathcal{A}}}_{i}(\bm{x})$ is analogous to the method presented in Theorem 3.

5.3 Non-Convex Terms in the Objective Function

All exactness results in this paper extend immediately to the setting where the objective function in (3) involves non-convex quadratic terms in the uncertainty $\bm{\xi}$ . Specifically, we consider the objective function

[TABLE]

where $\bm{D}(\bm{x}):\mathcal{X}\rightarrow\mathbb{S}^{K}$ is a matrix-valued affine function of $\bm{x}$ . We can still use Theorem 1 to reformulate $Z(\bm{x})$ as the optimal value of a copositive program. By following the steps of Proposition 1 and Theorem 3, the copositive programming reformulation is obtained by replacing the last constraint in (19) with the copositive constraint

[TABLE]

where

[TABLE]

We omit the details for the sake of brevity.

6 Numerical Experiments

In this section, we assess the performance of the SDP approximations presented in Section 4. All optimization problems are solved using the YALMIP interface [35] on a 16-core 3.4 GHz computer with 32 GB RAM. We use MOSEK 8.1 to solve SDP formulations, and CPLEX 12.8 to solve integer programs and non-convex quadratic programs.

6.1 Least Squares

The classical least squares problem seeks an approximate solution $\bm{x}$ to an overdetermined linear system $\bm{A}\bm{x}=\bm{b}$ which minimizes the residual $\|\bm{A}\bm{x}-\bm{b}\|^{2}$ . This yields the following quadratic program:

$$\min_{\bm{x}\in\mathbb{R}^{N}}\;\|\bm{A}\bm{x}-\bm{b}\|^{2}$$

The solution to this problem can be very sensitive to perturbations in the input data $\bm{A}\in\mathbb{R}^{M\times N}$ and $\bm{b}\in\mathbb{R}^{M}$ [19, 26]. To address the issue of parameter uncertainty, El Ghaoui and Lebret [23] recommend solving the following robust optimization problem:

[TABLE]

Here, the goal is to find a solution $\bm{x}$ that minimizes the worst-case residual when the matrix $\bm{U}$ and the vector $\bm{v}$ can vary within the prescribed uncertainty set $\mathcal{U}$ . A tractable SDP reformulation of this problem is derived in [23] for problem instances where the uncertainty set is given by the Frobenius norm ball

[TABLE]

We consider the case when the uncertainty set is a polytope, and compare our SDP scheme with the state-of-the-art approximate $\mathcal{S}$ -lemma method described in Section 4.1. We also compare our method with the approximation scheme proposed by Bertsimas and Sim [7], where the worst-case quadratic term in (2) is replaced with an upper bounding function. Minimizing this upper-bounding function over $\bm{x}$ yields an approximate solution to the RQP. We note that the robust least squares problem can be solved to optimality using Benders’ constraint generation method [8]. However, doing so entails solving a non-convex quadratic optimization problem at each step to generate a valid cut, which becomes intractable when $M$ and $N$ become large.

In our experiment, we consider the case where the uncertainty affects only the right-hand side vector $\bm{b}$ (i.e., $\bm{U}=\bm{0}$ ). We assume that the uncertain parameter $\bm{v}$ depends affinely on $N_{f}$ factors represented by $\bm{\xi}\in\mathbb{R}^{N_{f}}$ , where $N_{f}<M$ . Specifically, we consider the uncertainty set

[TABLE]

where $\bm{F}\in\mathbb{R}^{M\times N_{f}}$ is the factor matrix and $\rho$ lies in the interval $[0,1]$ . By substituting $\bm{U}=\bm{0}$ and $\bm{v}=\bm{F}\bm{\xi}$ into (36), the resulting robust problem constitutes an instance of RQP (3) with the following input parameters:

[TABLE]

In order to solve the problem using our method, we modify the formulation discussed in Section 4 slightly, which leads to a tremendous reduction in the solution time. We discuss this modification in Appendix A.

We perform an experiment on problem instances of dimensions $M=200$ , $N=20$ and $N_{f}=30$ . The experimental results are averaged over $100$ random trials generated in the following manner. In each trial, we sample the matrix $\bm{A}$ and the vector $\bm{b}$ from the uniform distribution on $[-0.5,0.5]^{M\times N}$ and $[-0.5,0.5]^{M}$ , respectively. Each row of the matrix $\bm{F}$ is sampled randomly from a standard simplex, and $\rho$ is generated uniformly at random from the interval $[0.1,0.25]$ . For problems of this size, we are unable to solve the problem to optimality using Benders’ method as the solver runs out of memory. Therefore, we put a time limit of $120$ seconds for each iteration of the Benders’ method. By doing so, Benders’ method yields a lower bound to the optimal worst-case residual, which we use as a baseline to compute the objective gaps for the approximation methods.
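The instance generation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: rows of $\bm{F}$ are drawn uniformly from the simplex via a flat Dirichlet distribution, $\bm{b}$ is drawn in $\mathbb{R}^{M}$ to match $\bm{b}\in\mathbb{R}^{M}$, and the residual is taken as $\|\bm{A}\bm{x}-\bm{b}-\bm{F}\bm{\xi}\|^{2}$ (the sign of the perturbation is immaterial because the uncertainty set is symmetric). The sampled bound is a crude stand-in for the Benders' lower bound, not a replacement for it.

```python
import numpy as np

def generate_instance(M=200, N=20, Nf=30, seed=0):
    """Draw one random trial following the setup described in the text."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(-0.5, 0.5, size=(M, N))
    b = rng.uniform(-0.5, 0.5, size=M)
    F = rng.dirichlet(np.ones(Nf), size=M)   # assumption: uniform on the simplex
    rho = rng.uniform(0.1, 0.25)
    return A, b, F, rho, rng

def residual(A, b, F, x, xi):
    """Squared residual ||A x - b - F xi||^2 (sign convention assumed)."""
    r = A @ x - b - F @ xi
    return float(r @ r)

def sampled_lower_bound(A, b, F, rho, x, rng, num_samples=2000):
    """Lower bound on the worst-case residual from random feasible points.

    Each sample has floor(rho * Nf) entries equal to +1 or -1 and zeros elsewhere,
    so it satisfies ||xi||_inf <= 1 and ||xi||_1 <= rho * Nf.
    """
    Nf = F.shape[1]
    k = int(np.floor(rho * Nf))
    best = residual(A, b, F, x, np.zeros(Nf))   # xi = 0 is always feasible
    for _ in range(num_samples):
        xi = np.zeros(Nf)
        support = rng.choice(Nf, size=k, replace=False)
        xi[support] = rng.choice([-1.0, 1.0], size=k)
        best = max(best, residual(A, b, F, x, xi))
    return best

A, b, F, rho, rng = generate_instance()
x_nominal = np.linalg.lstsq(A, b, rcond=None)[0]   # classical least squares solution
nominal_residual = residual(A, b, F, x_nominal, np.zeros(F.shape[1]))
lower_bound = sampled_lower_bound(A, b, F, rho, x_nominal, rng)
```

Because the residual is convex in $\bm{\xi}$, its maximum over the polytope is attained at a vertex, which is why the samples are concentrated on points with entries in $\{-1,0,1\}$.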

Table 1 summarizes the optimality gaps of the approximation methods. The results show that our method significantly outperforms the other two approximations in terms of the estimates of the worst-case residuals. While the other two approximations generate overly pessimistic estimates of the resulting worst-case residuals (with a relative difference of about 100%), the worst-case residuals estimated using our method have negligible objective gaps.

Table 2 reports the solution times of finding the exact solution (using Benders’ method) and the upper bounds provided by various approximation methods. It can be observed that the improvement in solution quality given by our method comes at the cost of longer solution times compared to other approximation methods. However, our method is still significantly faster than the exact Benders’ method. We also note that while the approximation scheme described in [7] can be solved quickly, it is only valid when the uncertainty set is defined as a norm-bounded set ( $l_{1}\cap l_{\infty}$ norm in our experiment). Our method, on the other hand, is applicable for general polyhedral uncertainty sets.

6.2 Project Management

In this experiment, we consider the project crashing problem described in Example 2, where the duration of activity $(i,j)\in\mathcal{A}$ is given by the uncertain quantity $d_{ij}=(1+r_{ij})d_{ij}^{0}$ . Here, $d_{ij}^{0}$ is the nominal activity duration and $r_{ij}$ represents exogenous fluctuations. We consider randomly generated project networks of size $|\mathcal{V}|=30$ and order strength 0.75,² which gives rise to projects with an average of $67$ activities. Let $x_{ij}$ be the amount of resources that are used to expedite the activity $(i,j)$ . We fix the feasible set of the resource allocation vector to $\mathcal{X}=\{\bm{x}\in[0,1]^{|\mathcal{A}|}:\mathbf{e}^{\top}\bm{x}\leq\frac{3}{4}|\mathcal{A}|\}$ , so that at most $75\%$ of the activities can receive the maximum resource allocation. The uncertainty set of $\bm{d}$ is defined through a factor model as follows:

² The order strength denotes the fraction of all $|\mathcal{V}|(|\mathcal{V}|-1)/2$ possible precedences between the nodes that are enforced in the graph (either directly or through transitivity).

[TABLE]

where the factor size is fixed to $N_{f}=|\mathcal{V}|$ . We set the nominal task durations to $\bm{d}^{0}=\mathbf{e}$ . In each trial, we sample the factor loading vector $\bm{f}_{ij}$ from the uniform distribution on $[-\frac{1}{2N_{f}},\frac{1}{2N_{f}}]^{N_{f}}$ , which ensures that the duration of each activity can deviate by up to $50\%$ of its nominal value. We can form the final mixed-integer uncertainty set $\Xi$ from $\mathcal{D}$ using the procedure described in Example 2 (Equation (8)).
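The order strength used to characterize the project networks above can be computed from the transitive closure of the precedence graph. The following sketch assumes an acyclic graph given as a list of arcs $(i,j)$ on nodes $0,\ldots,|\mathcal{V}|-1$; the function name and the example graphs are illustrative assumptions.

```python
def order_strength(num_nodes, arcs):
    """Fraction of the n(n-1)/2 node pairs that are ordered, directly or transitively.

    Assumes the arcs form a directed acyclic graph, so each unordered pair
    is reachable in at most one direction.
    """
    reach = [[False] * num_nodes for _ in range(num_nodes)]
    for i, j in arcs:
        reach[i][j] = True
    # Warshall's algorithm for the transitive closure.
    for k in range(num_nodes):
        for i in range(num_nodes):
            if reach[i][k]:
                for j in range(num_nodes):
                    if reach[k][j]:
                        reach[i][j] = True
    ordered_pairs = sum(
        reach[i][j] for i in range(num_nodes) for j in range(num_nodes) if i != j
    )
    return ordered_pairs / (num_nodes * (num_nodes - 1) / 2)

# Hypothetical examples on four nodes.
chain = order_strength(4, [(0, 1), (1, 2), (2, 3)])   # every pair is ordered
fork = order_strength(4, [(0, 1), (1, 2), (0, 3)])    # pairs 0-1, 1-2, 0-2, 0-3
```

A chain has order strength 1 because transitivity orders every pair, while the second graph orders only four of the six pairs.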

In our experiment, we compare the performance of our proposed SDP approximation with the linear decision rules (LDR) approximation scheme discussed in [14, 48], which we describe below. In Example 2, for our reformulation, we model the second-stage problem as a maximization problem over the binary variables $\bm{z}$ (see Equation (6)). Alternatively, the second-stage problem can be written as the following minimization problem:

[TABLE]

Here, $\bm{\rho}$ is the second-stage variable, which depends on the realization of the uncertain durations $\bm{d}$ . In the LDR approximation scheme, $\bm{\rho}$ is restricted to be an affine function of $\bm{d}$ , which yields a tractable conservative approximation. To assess the suboptimality of our SDP and the LDR approximation scheme, we solve the problem to optimality using Benders’ constraint generation method.

Table 3 presents the optimality gaps of the two approximation methods for $100$ randomly generated project networks. The solution times of all the methods are reported in Table 4. It can be observed that our proposed SDP approximation consistently provides near-optimal estimates of the worst-case project makespan ( $\sim 2.7\%$ gaps). On the other hand, while the LDR bound can be computed quickly, the bounds are too pessimistic ( $\sim 27\%$ gaps). The $10$ th and $90$ th percentiles of the objective gaps further indicate that the estimated makespan generated from our SDP approximation stochastically dominates the makespan generated from the LDR approximation. In addition to a higher estimate of the worst-case makespan, the actual makespan of the resource allocation $\bm{x}$ generated by the LDR approximation is also higher than that of the allocation generated by our method, as shown in the “Suboptimality” column in Table 3. The experimental results demonstrate that our method generates near-optimal solutions to the project crashing problem faster than solving the problem to optimality using Benders’ method.

6.3 Multi-Item Newsvendor

We now demonstrate the advantage of using a mixed-integer uncertainty set over using a continuous uncertainty set in a variant of the multi-item newsvendor problem, where an inventory planner must determine the vector $\bm{x}\in\mathbb{R}_{+}^{N}$ of order quantities for $N$ different raw materials at the beginning of a planning period. The raw materials are used to make $K$ different types of products, which are then sold to customers. The matrix $\bm{F}\in\mathbb{R}^{N\times K}$ is such that $F_{nk}$ represents the amount of raw material $n$ required to make $1$ unit of product $k$ . The demands $\bm{\xi}\in\mathbb{Z}_{+}^{K}$ for these products are uncertain and are assumed to belong to a prescribed discrete uncertainty set $\Xi$ . We assume that there are no ordering costs on the raw materials but the total order quantity must not exceed a given budget $B$ . Excess inventory of the $n$ -th raw material incurs a per-unit holding cost of $g_{n}$ , while the unmet demand incurs a quadratic penalty with coefficient $\lambda$ . The quadratic penalty on the unmet demand is added to discourage stock-outs [30, 49].

For any realization of the demand vector $\bm{\xi}$ , the total cost of a fixed order $\bm{x}$ is given by

[TABLE]

Here, we use the notation $z^{+}$ to denote $\max\{z,0\}$ . The objective of a risk-averse inventory planner is then to determine a vector of order quantities $\bm{x}$ that minimizes the worst-case total cost $\sup_{\bm{\xi}\in\Xi}\mathcal{R}(\bm{x},\bm{\xi})$ . This gives rise to the optimization problem

[TABLE]

This problem constitutes an instance of the two-stage robust quadratic optimization problem (29) with parameters

[TABLE]
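To make the cost structure concrete, the sketch below evaluates a total cost of the form $\bm{g}^{\top}(\bm{x}-\bm{F}\bm{\xi})^{+}+\lambda\|(\bm{F}\bm{\xi}-\bm{x})^{+}\|^{2}$ and its worst case over a small finite demand set by enumeration. This functional form is an assumption inferred from the verbal description (linear holding cost on excess raw material, quadratic penalty on the shortfall) and should be checked against the cost function defined above. The demand set and all numbers are hypothetical.

```python
import itertools
import numpy as np

def total_cost(x, xi, F, g, lam):
    """Assumed cost: linear holding cost on excess plus quadratic shortfall penalty."""
    required = F @ xi                            # raw material needed to meet demand xi
    excess = np.maximum(x - required, 0.0)
    shortfall = np.maximum(required - x, 0.0)
    return float(g @ excess) + lam * float(shortfall @ shortfall)

def worst_case_cost(x, demand_set, F, g, lam):
    """Maximum cost over an explicitly enumerated finite demand set."""
    return max(total_cost(x, np.asarray(xi, dtype=float), F, g, lam) for xi in demand_set)

# Hypothetical toy instance: two raw materials, two products, demands in {0, 1, 2}^2.
F = np.eye(2)
g = np.ones(2)
lam = 10.0
x = np.array([1.0, 1.0])
demand_set = list(itertools.product(range(3), repeat=2))

worst = worst_case_cost(x, demand_set, F, g, lam)
```

In this toy instance the worst case is the highest demand vector $(2,2)$, where the quadratic shortfall penalty dominates the holding cost incurred at low demand.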

In this experiment, we compare the performance of the SDP approximation of the optimization problem (37) when $\bm{\xi}$ is explicitly modeled as a discrete vector versus the model where the integer restriction on $\bm{\xi}$ is ignored. We consider problems with $N=8$ raw materials and $K=5$ products. We fix the vector of holding costs to $\bm{g}=\mathbf{e}$ , the ordering budget to $B=20$ , and the penalty constant to $\lambda=10$ . All experimental results are averaged over $100$ random trials generated in the following manner. We assume that every product uses one unit each of two randomly chosen raw materials. In each trial, we generate every element of $\bm{G}\in\mathbb{R}^{2\times K}$ uniformly at random from the interval $[0,1]$ . We define the actual discrete uncertainty set ( $\Xi_{\rm True}$ ) and the set formed by ignoring the integrality assumption ( $\Xi_{\rm Cont}$ ) as:

[TABLE]

and solve the SDP approximations of (37) with inputs $\Xi_{\rm True}$ and $\Xi_{\rm Cont}$ . We use the Benders’ constraint generation method to solve the problem to optimality.

The statistics of the optimality gaps generated by the models using $\Xi_{\rm True}$ and $\Xi_{\rm Cont}$ are reported in Table 5. The solution times of all the methods are presented in Table 6. We observe that the model using $\Xi_{\rm True}$ as the uncertainty set provides much better estimates of the worst-case cost ( $\sim 13\%$ average gap) than the model using $\Xi_{\rm Cont}$ ( $\sim 85\%$ average gap). Furthermore, our proposed SDP approximation can be solved much faster than solving the problem exactly using Benders’ method. For problems with integer uncertainty, these experimental results suggest that the SDP approximation which utilizes the integer restriction gives high-quality solutions in comparison to the approximation which neglects these restrictions.

7 Conclusion

The paper aims at developing a near-optimal approximation method for one- and two-stage robust quadratic programs with mixed-integer uncertain parameters. The approximation method developed in the paper is not only more general than the current state-of-the-art approximate $\mathcal{S}$ -lemma method, since the latter only handles continuous uncertain parameters, but is also guaranteed to yield a better estimate of the optimal value. Furthermore, our numerical experiments show that the difference in the performance of the two approximation methods can be quite significant. Our experimental results also demonstrate the disadvantage of ignoring the integer restrictions on the uncertain parameters. In the future, it would be interesting to extend the model to the distributionally robust setting, where additional information about the distribution of the uncertain parameters is explicitly incorporated.

Appendix A Implementation of Least Squares in Section 6.1

In this section, we limit the discussion to the case when there are no discrete uncertain parameters. In the paper, we consider the uncertainty set to be of the standard form $\Xi_{S}:=\{\bm{\xi}\geq\bm{0}:\bm{S}\bm{\xi}=\bm{t}\}$ . However, in some cases, the uncertainty sets are more naturally represented in the inequality form $\Xi_{I}:=\{\bm{\xi}:\bm{S}\bm{\xi}\leq\bm{t}\}$ . Transforming the uncertainty set into standard form involves introducing additional variables and constraints, which increases the problem size. As an example, in the least squares experiment in Section 6.1, we consider the uncertainty set $\Xi=\left\{\bm{\xi}\in\mathbb{R}^{N_{f}}:\;\left\lVert\bm{\xi}\right\rVert_{\infty}\leq 1,\;\left\lVert\bm{\xi}\right\rVert_{1}\leq\rho N_{f}\right\}$ . By lifting, the uncertainty set can be equivalently written as $\Xi_{LS}=\left\{(\bm{\xi},\bm{\gamma}):\bm{\xi}\in\mathbb{R}^{N_{f}},\;\bm{\gamma}\in\mathbb{R}^{N_{f}},\;-\bm{\xi}\leq\bm{\gamma},\;\bm{\xi}\leq\bm{\gamma},\;\bm{\gamma}\leq\mathbf{e},\;\mathbf{e}^{\top}\bm{\gamma}\leq\rho N_{f}\right\}$ , which is of the form $\Xi_{I}$ . The paper [12] presents a generalized copositive programming (GCP) reformulation of non-convex quadratic programs over conic representable sets. In [50], the authors consider a conservative approximation when the cone is polyhedral, which is relevant for the polyhedral uncertainty sets that we consider. Utilizing this GCP-based approximation, the robust least squares problem

[TABLE]

that we consider in Section 6.1 yields the following conservative SDP approximation:

[TABLE]

We use this formulation with the uncertainty set $\Xi_{LS}$ for our experiment in Section 6.1. Skipping the conversion to the standard form generates the same objective value, but reduces the average solution time from $85$ seconds to about $10$ seconds. We emphasize that this alternate formulation might not be valid when some of the components of $\bm{\xi}$ are restricted to be integers. Therefore, it is not straightforward to apply it to the project management and the newsvendor experiments, both of which contain discrete uncertain parameters.
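The lifting of the $l_{1}\cap l_{\infty}$ set into the inequality form $\Xi_{I}$ can be written out explicitly. The sketch below assembles a matrix $\bm{S}$ and a vector $\bm{t}$ for the stacked variable $(\bm{\xi},\bm{\gamma})$ from the four constraint groups of $\Xi_{LS}$ and checks membership using the choice $\bm{\gamma}=|\bm{\xi}|$, the smallest feasible $\bm{\gamma}$. The function names and the numerical tolerance are illustrative assumptions.

```python
import numpy as np

def lifted_inequality_form(Nf, rho):
    """Build (S, t) with Xi_LS = {(xi, gamma) : S [xi; gamma] <= t}."""
    I = np.eye(Nf)
    Z = np.zeros((Nf, Nf))
    S = np.vstack([
        np.hstack([-I, -I]),                                  # -xi <= gamma
        np.hstack([I, -I]),                                   #  xi <= gamma
        np.hstack([Z, I]),                                    #  gamma <= e
        np.hstack([np.zeros((1, Nf)), np.ones((1, Nf))]),     #  e^T gamma <= rho * Nf
    ])
    t = np.concatenate([np.zeros(2 * Nf), np.ones(Nf), [rho * Nf]])
    return S, t

def in_original_set(xi, rho, tol=1e-12):
    """Membership in {xi : ||xi||_inf <= 1, ||xi||_1 <= rho * Nf}."""
    return bool(np.max(np.abs(xi)) <= 1.0 + tol and np.sum(np.abs(xi)) <= rho * len(xi) + tol)

def in_lifted_set(xi, S, t, tol=1e-12):
    """Membership via the lifted form with gamma = |xi|, the smallest feasible gamma."""
    z = np.concatenate([xi, np.abs(xi)])
    return bool(np.all(S @ z <= t + tol))

Nf, rho = 4, 0.5
S, t = lifted_inequality_form(Nf, rho)
inside = np.array([1.0, -1.0, 0.0, 0.0])        # l1 norm 2 = rho * Nf
too_long = np.array([1.0, 1.0, 0.5, 0.0])       # violates the l1 bound
too_large = np.array([1.5, 0.0, 0.0, 0.0])      # violates the l_inf bound
```

The lifted description uses $3N_{f}+1$ inequalities in $2N_{f}$ variables and needs no slack variables, which is the size advantage over the standard form discussed above.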

Bibliography

[1] S. D. Ahipasaoglu, K. Natarajan, and D. Shi. Distributionally robust project crashing with partial or no correlation information. 2016.
[2] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski. Robust Optimization. Princeton University Press, 2009.
[3] A. Ben-Tal, A. Goryashko, E. Guslitzer, and A. Nemirovski. Adjustable robust solutions of uncertain linear programs. Mathematical Programming A, 99(2):351–376, 2004.
[4] A. Ben-Tal and A. Nemirovski. Robust convex optimization. Mathematics of Operations Research, 23(4):769–805, 1998.
[5] A. Ben-Tal, A. Nemirovski, and C. Roos. Robust solutions of uncertain quadratic and conic-quadratic problems. SIAM Journal on Optimization, 13(2):535–560, 2002.
[6] D. Bertsimas, D. B. Brown, and C. Caramanis. Theory and applications of robust optimization. SIAM Review, 53(3):464–501, 2011.
[7] D. Bertsimas and M. Sim. Tractable approximations to robust conic optimization problems. Mathematical Programming B, 107(1-2):5–36, 2006.
[8] J. W. Blankenship and J. E. Falk. Infinitely constrained optimization problems. Journal of Optimization Theory and Applications, 19(2):261–281, 1976.
