Small and Strong Formulations for Unions of Convex Sets from the Cayley Embedding

Juan Pablo Vielma

arXiv:1704.03954·math.OC·March 13, 2018·Math. Program.

TL;DR

This paper introduces a novel geometric technique based on Cayley embeddings to create small, strong mixed-integer programming formulations for unions of convex sets without auxiliary continuous variables, improving computational efficiency.

Contribution

It develops a new geometric approach that generalizes Cayley embeddings, enabling the construction of strong, compact formulations for convex disjunctive constraints without auxiliary variables.

Findings

1. The technique recovers and generalizes all known strong formulations without continuous auxiliary variables.

2. It yields small and strong formulations for a wide range of disjunctive constraints.

3. The non-polyhedral generalization of the Cayley embedding inherits many geometric properties of the original embedding.

Equations

x\in\bigcup\nolimits_{i=1}^{k}C^{i}\quad\Leftrightarrow\quad\exists\left(z,y\right)\in\mathbb{R}^{p}\times\mathbb{Z}^{k}\text{ s.t. }\left(x,z,y\right)\in Q.

\tilde{C}=\left\{x\in\mathbb{R}^{n}\,:\,\gamma_{\tilde{C}-b}\left(x-b\right)\leq 1\right\}.

\gamma_{C^{i}-b^{i}}\left(x^{i}-b^{i}y_{i}\right)\leq y_{i},\quad\forall i\in\left\llbracket k\right\rrbracket,\quad\sum\nolimits_{i=1}^{k}y_{i}

\sum\nolimits_{i=1}^{k}x^{i}

\gamma_{C^{i}-b^{i}}\left(x-b^{i}\right)\leq\sum\nolimits_{j=1}^{k}M_{i,j}y_{j},\;\forall i\in\left\llbracket k\right\rrbracket,\quad\sum\nolimits_{i=1}^{k}y_{i}=1,\;y\in\left\{0,1\right\}^{k}.

C^{1}

\left\lVert\left(2y,\;s_{1}x_{1}-s_{2}x_{2}\right)\right\rVert_{2}\leq 4y-s_{1}x_{1}-s_{2}x_{2}\quad\forall s\in\left\{-1,1\right\}^{2}

\left\lVert\left(2y_{1},\;s_{1}x_{1}^{1}-s_{2}x_{2}^{1}\right)\right\rVert_{2}

-\left(5/4\right)y_{2}\leq x_{j}^{2}

x^{1}+x^{2}=x,\quad y_{1}+y_{2}=1,\quad y

\left\lVert\left(2\left(y_{1}+M_{1,2}y_{2}\right),\;s_{1}x_{1}-s_{2}x_{2}\right)\right\rVert_{2}

-s_{1}x_{1}-s_{2}x_{2}

-M_{2,1}y_{1}-\left(5/4\right)y_{2}\leq x_{j}

y_{1}+y_{2}=1,\quad y

\gamma_{C^{i}-b^{i}}\left(\left[x-b^{i}y_{i}\right]_{J_{i}}\right)\leq y_{i}\quad\forall i\in\left\llbracket k\right\rrbracket,\quad\sum\nolimits_{i=1}^{k}y_{i}=1,\quad y\in\left\{0,1\right\}^{k}.

\gamma_{G^{i}-b^{i}}\left(\left[x-l^{1}y_{1}-u^{2}y_{2}\right]_{J}\right)

y_{1}l_{j}^{1}+y_{2}l_{j}^{2}\leq x_{j}

y_{1}+y_{2}=1,\quad y

\gamma_{C}\left(x-\sum\nolimits_{i=1}^{k}y_{i}b^{i}\right)\leq\sum\nolimits_{i=1}^{k}r_{i}y_{i},\quad\sum\nolimits_{i=1}^{k}y_{i}=1,\quad y_{i}\geq 0\quad\forall i\in\left\llbracket k\right\rrbracket.

\left(\left(G^{i}\cap K^{i}\right)-K^{i}\right)\cap K^{i}=G^{i}\cap K^{i}

\gamma_{G^{i}}\left(\sum\nolimits_{j\in J_{i}}u^{i,j}\left(u^{i,j}\cdot x-\sum\nolimits_{l=1}^{k}\overline{b}_{j}^{i,l}y_{l}\right)^{+}\right)

t_{j}v^{j}\cdot x-\sum\nolimits_{l=1}^{k}\underline{b}_{j}^{l}y_{l}

\sum\nolimits_{i=1}^{k}y_{i}=1,\quad y_{i}

\operatorname{aff}\left(Q\left(\mathcal{C}\right)\right)=\left\{\left(x,y\right)\in\mathbb{R}^{n+k}\,:\,\sum\nolimits_{i=1}^{k}y_{i}=1,\quad Ax=\sum\nolimits_{i=1}^{k}Ab^{i}y_{i}\right\}.

\bigcup\nolimits_{u\in\mathcal{U}\left(\mathcal{C}\right)}\operatorname{conv}\left(\bigcup\nolimits_{i=1}^{k}F_{C^{i}}\left(u\right)\times\left\{{\bf e}^{i}\right\}\right)=\bigcup\nolimits_{\mathcal{X}\in N\left(\mathcal{C}\right)}Q\left(\mathcal{X}\right)

\bigcup\nolimits_{i=1}^{k}\operatorname{conv}\left(\bigcup\nolimits_{j\neq i}C^{j}\times\left\{{\bf e}^{j}\right\}\right)=\bigcup\nolimits_{i=1}^{k}\left\{\left(x,y\right)\in Q\left(\mathcal{C}\right)\,:\,y_{i}=0\right\}.

C^{i}

\forall u\in U\quad\exists j\in\left\llbracket m\right\rrbracket\text{ s.t. }\sigma_{C^{i}}\left(u\right)

\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right)=\max\nolimits_{i=1}^{k}\sigma_{C^{i}}\left(u\right)+v\cdot{\bf e}^{i}.

\gamma_{C^{j,0}}\left(x-\sum_{i=1}^{k}y_{i}b^{j,i}\right)\leq\sum_{i=1}^{k}r^{j}_{i}y_{i}\;\forall j\in\left\llbracket m\right\rrbracket,\quad\sum_{i=1}^{k}y_{i}=1,\quad y\in\left\{0,1\right\}^{k}.

\operatorname{epi}\left(\gamma_{C^{s,0}}\right)=\left\{\left(x,y\right)\in\mathbb{R}^{3}\,:\,\begin{aligned} \left\lVert\left(2y,\;s_{1}x_{1}-s_{2}x_{2}\right)\right\rVert_{2}&\leq 4y-s_{1}x_{1}-s_{2}x_{2},\\ s_{j}x_{j}&\leq(3/2)y\quad\forall j\in\left\llbracket 2\right\rrbracket\end{aligned}\right\}.

\left\lVert\left(2y_{1},\;s_{1}x_{1}-s_{2}x_{2}\right)\right\rVert_{2}

s_{j}x_{j}

y_{1}+y_{2}

\forall B\in\mathcal{B}\quad\left(\bar{x}\left(B,b^{i}\right)\in P^{i}\quad\forall i\in\left\llbracket k\right\rrbracket\right)\quad\vee\quad\left(\bar{x}\left(B,b^{i}\right)\notin P^{i}\quad\forall i\in\left\llbracket k\right\rrbracket\right),

Ax\leq\sum\nolimits_{i=1}^{k}b^{i}y_{i},\quad\sum\nolimits_{i=1}^{k}y_{i}=1,\quad y\in\left\{0,1\right\}^{k}.

\max\left\{c\cdot x\,:\,x\in P\left(B,b^{i}\right)\right\}=\max\left\{c\cdot x\,:\,x\in P^{i}\right\}\quad\forall i\in\left\llbracket k\right\rrbracket,

Full text

Sloan School of Management, Massachusetts Institute of Technology, Cambridge MA 02139, USA

Abstract

There is often a significant trade-off between formulation strength and size in mixed integer programming (MIP). When modeling convex disjunctive constraints (e.g. unions of convex sets), adding auxiliary continuous variables can sometimes help resolve this trade-off. However, standard formulations that use such auxiliary continuous variables can have a worse-than-expected computational effectiveness, which is often attributed precisely to these auxiliary continuous variables. For this reason, there has been considerable interest in constructing strong formulations that do not use continuous auxiliary variables. We introduce a technique to construct formulations without these detrimental continuous auxiliary variables. To develop this technique we introduce a natural non-polyhedral generalization of the Cayley embedding of a family of polytopes and show it inherits many geometric properties of the original embedding. We then show how the associated formulation technique can be used to construct small and strong formulations for a wide range of disjunctive constraints. In particular, we show it can recover and generalize all known strong formulations without continuous auxiliary variables.

Keywords:

Mixed integer nonlinear programming; Mixed integer programming formulations; Disjunctive constraints

1 Introduction

Mixed integer programming (MIP) adds integrality requirements to a continuous optimization problem, which is often referred to as the continuous relaxation of the MIP problem. MIP problems with a convex continuous relaxation often arise from the need to model disjunctive constraints of the form $x\in\bigcup\nolimits_{i=1}^{k}C^{i}$ where $\{C^{i}\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ is a family of closed convex sets. The two main classes of formulations for these constraints are the so-called Big-M and convex hull formulations. Big-M formulations are simple and small, but their continuous relaxations usually yield weak bounds, which can hinder the performance of branch-and-bound based algorithms. In contrast, the convex hull formulation yields the best possible relaxation bounds for a single disjunctive constraint and normally yields strong bounds for problems with multiple constraints. Unfortunately, while convex hull formulations are only moderately larger than Big-M formulations, their computational performance is usually much worse. The folklore attributes this poor performance to certain continuous auxiliary variables used by the convex hull formulation. This has prompted significant interest in techniques to project out such variables (i.e. eliminate them without decreasing the formulation's strength). The resulting formulations can provide a significant computational advantage, but existing techniques are limited to very specific structures (e.g. see balas88 ; blair90 ; jeroslow88 ; Embedding ; DBLP:journals/mp/VielmaN11 for polyhedra and lodi15 ; gunluk2010perspective ; Hijazi ; mohit1 ; mohit2 for non-polyhedral sets). In this paper we introduce a technique to project out the detrimental auxiliary variables for a wide range of disjunctive constraints. In particular, the technique can be used to recover and generalize all known results that use binary variables that add to one (e.g. excluding the logarithmic formulation from Embedding ; DBLP:journals/mp/VielmaN11 ).

Our technique is based on a geometric characterization of the projection of the convex hull formulation that connects it to a natural non-polyhedral generalization of an object known as the Cayley embedding of a family of polytopes. To obtain this characterization we generalize to the non-polyhedral setting some known properties of the Cayley embedding and use it to obtain a valid formulation of the disjunctive constraint. We then give simple sufficient conditions for this formulation to be equal to the projection of the convex hull formulation. Using these conditions we then recover and generalize all known techniques to project the convex hull formulation. We also provide precise necessary and sufficient conditions to obtain the projection and comment on the practical implementation of the formulations. In particular, we evaluate the representation of the projection from an algebraic geometry perspective.

The paper is organized as follows. In Section 2 we introduce a geometric abstraction that unifies all known formulations in a common framework. In Section 3 we introduce the geometric characterization and describe the projected convex hull formulation for two simple cases. In Section 4 we present the generalized properties of the Cayley embedding and the simple sufficient conditions. We then use these conditions to recover and generalize all existing formulations in Section 5. In particular, we give guidance on how to apply the technique, comment on its practical implementation and present the algebraic geometry result. Finally, in Section 6 we give detailed necessary and sufficient conditions for the technique. Omitted proofs are included in Section 8.

We use the following notation. For a function $f:\mathbb{R}^{n}\to\mathbb{R}\cup\left\{\infty\right\}$ we let its epigraph be $\operatorname{epi}\left(f\right):=\left\{\left(x,z\right)\in\mathbb{R}^{n+1}\,:\,f(x)\leq z\right\}$ . For a set $S\subseteq\mathbb{R}^{n}$ we denote its topological closure, its convex hull, its conic hull and its affine hull by $\operatorname{cl}\left(S\right)$ , $\operatorname{conv}\left(S\right)$ , $\operatorname{cone}\left(S\right)$ and $\operatorname{aff}\left(S\right)$ . For a closed convex set $C\subseteq\mathbb{R}^{n}$ we denote its recession cone by $C_{\infty}$ and the set containing all its extreme points by $\operatorname{ext}\left(C\right)$ . We let $\left\llbracket k\right\rrbracket:=\left\{1,\ldots,k\right\}$ , ${\bf e}^{i}\in\mathbb{R}^{n}$ be the $i$ -th canonical vector, ${\mathbf{1}}\in\mathbb{R}^{n}$ be the all ones vector and ${\mathbf{0}}\in\mathbb{R}^{n}$ be the all zeros vector (the specific dimension will be apparent from the context). For a closed convex cone $K$ we let $K^{*}$ be its polar cone. Finally, we let $\mathbb{Z}$ be the set of integers.

2 MIP formulations for unions of convex sets

Definition 1

Let $\{C^{i}\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ be a finite family of closed convex sets and $Q\subseteq\mathbb{R}^{n+p+k}$ be a closed convex set. We say $\left(x,z,y\right)\in Q,\quad y\in\mathbb{Z}^{k}$ is a MIP formulation of $x\in\bigcup\nolimits_{i=1}^{k}C^{i}$ if and only if

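$$x\in\bigcup\nolimits_{i=1}^{k}C^{i}\quad\Leftrightarrow\quad\exists\left(z,y\right)\in\mathbb{R}^{p}\times\mathbb{Z}^{k}\text{ s.t. }\left(x,z,y\right)\in Q.$$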

We refer to $Q$ as the continuous relaxation of the MIP formulation and say the formulation is ideal if and only if for any minimal face $F$ of $Q$ and $\left(x,z,y\right)\in F$ we have $y\in\mathbb{Z}^{k}$ .

Existing formulations depend on specific set-representations. For instance, Balas, Jeroslow and Lowe give linear MIP formulations for polyhedra (e.g. (Mixed-Integer-Linear-Programming-Formulation-Techniques, Section 5)), Ben-Tal, Helton, Nemirovski and Nie give conic MIP formulations for conic representable sets ben2001lectures ; Helton , and Ceria, Mehrotra, Soares and Stubbs give perspective function formulations for function level sets springerlink:10.1007/s101070050106 ; springerlink:10.1007/s101070050103 . To abstract the representation we use the following properties of the gauge of a convex set (e.g. hiriart-lemarechal-2001 ).

Lemma 1

Let $C\subseteq\mathbb{R}^{n}$ be a closed convex set with ${\bf 0}\in C$ and $\gamma_{C}:\mathbb{R}^{n}\to\mathbb{R}\cup\left\{\infty\right\}$ be the gauge function of $C$ given by $\gamma_{C}\left(x\right):=\inf\left\{\lambda>0\,:\,x\in\lambda C\right\}$ . Then the following properties hold.

• If $x\notin\lambda C$ for all $\lambda>0$, then $\gamma_{C}\left(x\right)=\infty$.

• The gauge function $\gamma_{C}$ is convex and positively homogeneous.

• We have that $\left\{x\in\mathbb{R}^{n}\,:\,\gamma_{C}\left(x\right)\leq r\right\}=rC$ for any $r>0$ and $\left\{x\in\mathbb{R}^{n}\,:\,\gamma_{C}\left(x\right)\leq 0\right\}=C_{\infty}$.

Furthermore, if $\tilde{C}\subseteq\mathbb{R}^{n}$ is a closed convex set and $b\in\tilde{C}$ , then

$$\tilde{C}=\left\{x\in\mathbb{R}^{n}\,:\,\gamma_{\tilde{C}-b}\left(x-b\right)\leq 1\right\}.$$
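The properties in Lemma 1 can be checked numerically on small examples. The following is a minimal sketch, assuming the set $C$ is only available through a membership oracle; the function `gauge`, the oracles `in_box`, `in_disk`, `in_segment`, `in_halfplane`, and the search cap `hi` are all hypothetical names chosen for illustration and do not come from the paper.

```python
import math

def gauge(member, x, hi=1e6, tol=1e-9):
    """Approximate gamma_C(x) = inf{lam > 0 : x in lam*C} by bisection.

    member: hypothetical membership oracle for a closed convex set C with 0 in C.
    Returns math.inf when x is not in hi*C (the search cap hi is an assumption).
    """
    def inside(lam):
        return member([xi / lam for xi in x])

    if not inside(hi):
        return math.inf
    lo = 0.0
    # Since 0 is in C and C is convex, x in lam*C implies x in lam'*C for
    # every lam' >= lam, so membership is monotone in lam and bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if inside(mid):
            hi = mid
        else:
            lo = mid
    return hi

def in_box(p):        # C = [-5/4, 5/4]^2
    return all(abs(pi) <= 1.25 for pi in p)

def in_disk(p):       # C = Euclidean unit disk
    return math.hypot(*p) <= 1.0

def in_segment(p):    # C = [-1, 1] x {0}, a set with empty interior
    return abs(p[0]) <= 1.0 and p[1] == 0.0

def in_halfplane(p):  # C = {x : x_1 <= 1}, with recession cone {x : x_1 <= 0}
    return p[0] <= 1.0
```

For instance, `gauge(in_segment, [0.0, 1.0])` is infinite because the point lies in no positive multiple of the segment (first bullet), while `gauge(in_halfplane, [-5.0, 0.0])` is numerically zero because the point lies in the recession cone (third bullet).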

Using gauge functions we can construct generic versions of standard formulations for convex sets that satisfy the following assumption.

Definition 2

We say $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ if and only if $C^{i}\subseteq\mathbb{R}^{n}$ is a non-empty closed convex set for all $i\in\left\llbracket k\right\rrbracket$ and $C^{i}_{\infty}=C^{j}_{\infty}$ for all $i,j\in\left\llbracket k\right\rrbracket$ .

Theorem 2.1

Let $\left\{b^{i}\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ and $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ be such that $b^{i}\in C^{i}$ for all $i\in\left\llbracket k\right\rrbracket$ , then an ideal formulation for $\bigcup_{i=1}^{k}C^{i}$ is given by

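$$\gamma_{C^{i}-b^{i}}\left(x^{i}-b^{i}y_{i}\right)\leq y_{i}\quad\forall i\in\left\llbracket k\right\rrbracket,\quad\sum\nolimits_{i=1}^{k}y_{i}=1,\quad\sum\nolimits_{i=1}^{k}x^{i}=x,\quad y\in\left\{0,1\right\}^{k}.\tag{2}$$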

In particular, if $\operatorname{ext}\left(C^{i}\right)\neq\emptyset$ for all $i\in\left\llbracket k\right\rrbracket$ , then the continuous relaxation of (2) is line-free and all of its extreme points have integral $y$ components.

The proof of Theorem 2.1 is analogous to that of existing formulations, but for completeness we include a proof in Section 8.1.

A key to obtaining the relatively simple and small ideal formulation (2) is the use of the $k$ copies $x^{i}$ of the original variables $x$. Unfortunately, these variable copies induce a block structure that the MIP folklore identifies as a source of the worse-than-expected computational performance of formulation (2). For this reason, simpler Big-M formulations are often preferred in practice, even though they usually fail to be ideal. We can abstract the specific structure of such Big-M formulations using gauge functions as follows.

Theorem 2.2

Let $\left\{b^{i}\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ and $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ be such that $b^{i}\in C^{i}$ for all $i\in\left\llbracket k\right\rrbracket$ , and $M\in\mathbb{R}^{k\times k}$ be such that $M_{i,i}=1$ for all $i\in\left\llbracket k\right\rrbracket$ and $C^{j}\subseteq\left\{x\in\mathbb{R}^{n}\,:\,\gamma_{C^{i}-b^{i}}\left(x-b^{i}\right)\leq M_{i,j}\right\}$ for all $i,j\in\left\llbracket k\right\rrbracket$ . Then a formulation for $\bigcup_{i=1}^{k}C^{i}$ is given by

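$$\gamma_{C^{i}-b^{i}}\left(x-b^{i}\right)\leq\sum\nolimits_{j=1}^{k}M_{i,j}y_{j}\quad\forall i\in\left\llbracket k\right\rrbracket,\quad\sum\nolimits_{i=1}^{k}y_{i}=1,\quad y\in\left\{0,1\right\}^{k}.$$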

The strength of this formulation depends on $M$, with the strongest formulation being obtained for the smallest valid coefficients.

The abstraction provided by the use of gauge functions allows us to focus on the geometric structure of the formulations. However, it does not provide an explicit representation of the formulations that can be easily fed to a MIP solver. Fortunately, we can use known properties of gauge functions to obtain practical representations of various classes and recover existing formulations.

Lemma 2

Let $C\subseteq\mathbb{R}^{n}$ be closed and convex with ${\bf 0}\in C$ and $E\subseteq\mathbb{R}^{n}\times\mathbb{R}_{+}$ be a closed convex cone. Then $E=\operatorname{epi}\left(\gamma_{C}\right)$ if and only if $C=\left\{x\in\mathbb{R}^{n}\,:\,\left(x,1\right)\in E\right\}$ and $C_{\infty}=\left\{x\in\mathbb{R}^{n}\,:\,\left(x,0\right)\in E\right\}$. In particular,

1. if $C:=\left\{x\in\mathbb{R}^{n}\,:\,\exists z\in\mathbb{R}^{p}\text{ s.t. }Ax+Bz+c\in K\right\}$ for a closed convex cone $K\subseteq\mathbb{R}^{m}$, matrices $A\in\mathbb{R}^{m\times n}$ and $B\in\mathbb{R}^{m\times p}$, and vector $c\in\mathbb{R}^{m}$, then $\operatorname{epi}\left(\gamma_{C}\right)=\left\{\left(x,y\right)\in\mathbb{R}^{n+1}\,:\,\exists z\in\mathbb{R}^{p}\text{ s.t. }Ax+Bz+cy\in K,\quad y\geq 0\right\}$,

2. if $f:\mathbb{R}^{n}\to\mathbb{R}\cup\left\{\infty\right\}$ is a closed convex function, $(\operatorname{cl}\tilde{f})(x,y)$ is the closure of the perspective function of $f$, and $C:=\left\{x\in\mathbb{R}^{n}\,:\,f(x)\leq 0\right\}$, then $\operatorname{epi}\left(\gamma_{C}\right)=\{(x,y)\in\mathbb{R}^{n}\times\mathbb{R}_{+}\,:\,(\operatorname{cl}\tilde{f})(x,y)\leq 0\}$,

3. if $b\in C$, then $\operatorname{epi}\left(\gamma_{C-b}\right)=\left\{\left(x,y\right)\in\mathbb{R}^{n+1}\,:\,\gamma_{C}\left(x+by\right)\leq y\right\}$, and

4. if ${\bf 0}\in C^{1}\cap C^{2}$, then $\operatorname{epi}\left(\gamma_{C^{1}\cap C^{2}}\right)=\operatorname{epi}\left(\gamma_{C^{1}}\right)\cap\operatorname{epi}\left(\gamma_{C^{2}}\right)$.
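As an illustration of the first item of Lemma 2, consider the polyhedral special case $C=\left\{x\,:\,Ax\leq c\right\}$ with $c\geq{\bf 0}$ (this corresponds to taking $K$ as the non-negative orthant, no $z$ variables, and a sign change of $A$). The homogenized description $Ax\leq cy,\ y\geq 0$ then gives the gauge in closed form. The sketch below is an assumption-laden illustration: `gauge_polyhedron`, `in_epigraph`, `BOX_A` and `BOX_c` are hypothetical names, and the box is the set $[-5/4,5/4]^{2}$.

```python
import math

def gauge_polyhedron(A, c, x):
    """gamma_C(x) for C = {x : A x <= c} with c >= 0, so that 0 is in C."""
    y = 0.0
    for row, ci in zip(A, c):
        ax = sum(a * xi for a, xi in zip(row, x))
        if ci > 0:
            y = max(y, ax / ci)   # the row requires ax <= ci * y
        elif ax > 0:
            return math.inf       # a row with ci = 0 is violated for every y
    return y

def in_epigraph(A, c, x, y, eps=1e-12):
    """Homogenized description of epi(gamma_C): A x <= c * y and y >= 0."""
    return y >= 0 and all(
        sum(a * xi for a, xi in zip(row, x)) <= ci * y + eps
        for row, ci in zip(A, c))

# The box [-5/4, 5/4]^2 written as A x <= c.
BOX_A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
BOX_c = [1.25, 1.25, 1.25, 1.25]
```

With these definitions, `in_epigraph(BOX_A, BOX_c, x, y)` is exactly the condition $-(5/4)y\leq x_{j}\leq(5/4)y$ for both coordinates together with $y\geq 0$.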

The following example illustrates Lemma 2 and Theorems 2.1 and 2.2.

Example 1

Let $C^{1}=\{x\in\mathbb{R}^{2}\,:\,\left(2-s_{1}x_{1}\right)\left(2-s_{2}x_{2}\right)\geq 1\quad\forall s\in\left\{-1,1\right\}^{2}\}$ and $C^{2}=[-5/4,5/4]^{2}$ be the sets depicted in Figure 1. Using standard conic representability results for the cone $\mathbf{L}^{n}:=\left\{\left(x,x_{0}\right)\in\mathbb{R}^{n+1}\,:\,\left\lVert x\right\rVert_{2}\leq x_{0}\right\}$ (e.g. ben2001lectures ) we have

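$$C^{1}=\left\{x\in\mathbb{R}^{2}\,:\,\left\lVert\left(2,\;s_{1}x_{1}-s_{2}x_{2}\right)\right\rVert_{2}\leq 4-s_{1}x_{1}-s_{2}x_{2}\quad\forall s\in\left\{-1,1\right\}^{2}\right\}.$$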

Using Lemma 2 we have $\gamma_{C^{1}}\left(x\right)\leq y$ if and only if

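$$\left\lVert\left(2y,\;s_{1}x_{1}-s_{2}x_{2}\right)\right\rVert_{2}\leq 4y-s_{1}x_{1}-s_{2}x_{2}\quad\forall s\in\left\{-1,1\right\}^{2}.$$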

Similarly $\gamma_{C^{2}}\left(x\right)\leq y$ if and only if $-\left(5/4\right)y\leq x_{j}\leq\left(5/4\right)y\quad\forall j\in\left\llbracket 2\right\rrbracket$ . Then Theorem 2.1 yields the ideal formulation for $\bigcup_{i=1}^{2}C^{i}$ given by

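$$\begin{aligned}\left\lVert\left(2y_{1},\;s_{1}x_{1}^{1}-s_{2}x_{2}^{1}\right)\right\rVert_{2}&\leq 4y_{1}-s_{1}x_{1}^{1}-s_{2}x_{2}^{1}&&\forall s\in\left\{-1,1\right\}^{2},\\ -\left(5/4\right)y_{2}\leq x_{j}^{2}&\leq\left(5/4\right)y_{2}&&\forall j\in\left\llbracket 2\right\rrbracket,\\ x^{1}+x^{2}=x,\quad y_{1}+y_{2}&=1,\quad y\in\left\{0,1\right\}^{2}.\end{aligned}$$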

Alternatively, we can use Theorem 2.2 to obtain the formulation given by

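$$\begin{aligned}\left\lVert\left(2\left(y_{1}+M_{1,2}y_{2}\right),\;s_{1}x_{1}-s_{2}x_{2}\right)\right\rVert_{2}&\leq 4\left(y_{1}+M_{1,2}y_{2}\right)-s_{1}x_{1}-s_{2}x_{2}&&\forall s\in\left\{-1,1\right\}^{2},\\ -M_{2,1}y_{1}-\left(5/4\right)y_{2}\leq x_{j}&\leq M_{2,1}y_{1}+\left(5/4\right)y_{2}&&\forall j\in\left\llbracket 2\right\rrbracket,\\ y_{1}+y_{2}&=1,\quad y\in\left\{0,1\right\}^{2}.\end{aligned}\tag{5}$$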

The smallest Big-M values that make this formulation valid are $M_{1,2}=5/4$ and $M_{2,1}=3/2$ . Unfortunately, we can check that for all $t\in(0,1)$ the point $\left(\bar{x}(t),\bar{y}(t)\right)$ with $\bar{x}(t)=\left((5+t)/4,(5/4)(5-t)(t-1)/(3t-5)\right)$ and $\bar{y}(t)=\left(t,1-t\right)$ is an extreme point of the continuous relaxation of (5) with fractional $y$ components. Furthermore, $\bar{x}(t)\notin\operatorname{conv}\left(C^{1}\cup C^{2}\right)$ for all $t\in(0,1)$ .∎
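The Big-M values and the fractional point of Example 1 can be sanity-checked numerically. The sketch below assumes that formulation (5) bounds $\gamma_{C^{1}}(x)$ by $y_{1}+M_{1,2}y_{2}$ through the second-order cone constraints and bounds $\lvert x_{j}\rvert$ by $M_{2,1}y_{1}+(5/4)y_{2}$, which is the scaling consistent with the stated value $M_{2,1}=3/2$. All function names are hypothetical, and the check covers only feasibility for the relaxation, not the extreme-point or convex-hull claims.

```python
import math
from itertools import product

SIGNS = list(product((-1, 1), repeat=2))

def in_C1(p, eps=1e-12):
    """Membership in C^1 = {x : (2 - s1*x1)(2 - s2*x2) >= 1 for all s}."""
    return all((2 - s1 * p[0]) * (2 - s2 * p[1]) >= 1 - eps for s1, s2 in SIGNS)

def gauge_C1(x, hi=64.0, iters=60):
    """Bisection for gamma_{C^1}(x); assumes x lies in hi * C^1."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if in_C1((x[0] / mid, x[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

# The gauge is convex, so its maximum over the box C^2 is attained at a corner.
C2_CORNERS = list(product((-1.25, 1.25), repeat=2))
M12 = max(gauge_C1(v) for v in C2_CORNERS)

def in_bigM_relaxation(x, y, M12=1.25, M21=1.5, eps=1e-9):
    """Continuous relaxation of the assumed form of formulation (5)."""
    y1, y2 = y
    w = y1 + M12 * y2            # right-hand side for gamma_{C^1}
    bound = M21 * y1 + 1.25 * y2  # bound on |x_j|
    soc = all(math.hypot(2 * w, s1 * x[0] - s2 * x[1])
              <= 4 * w - s1 * x[0] - s2 * x[1] + eps for s1, s2 in SIGNS)
    box = all(abs(xj) <= bound + eps for xj in x)
    return soc and box and abs(y1 + y2 - 1) <= eps and min(y) >= -eps

def x_bar(t):
    """The point x_bar(t) from Example 1, for t in (0, 1)."""
    return ((5 + t) / 4, 1.25 * (5 - t) * (t - 1) / (3 * t - 5))
```

For $t=1/2$ the point `x_bar(0.5)` with $y=(1/2,1/2)$ satisfies the relaxation, while `x_bar(0.5)` itself lies in neither $C^{1}$ nor $C^{2}$.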

An ideal formulation without the variable copies can be obtained by projecting (2) onto the $x$ and $y$ variables, but characterizing such projection can be challenging. However, an effective characterization can lead to significant computational improvements lodi15 ; gunluk2010perspective ; Hijazi ; Embedding ; DBLP:journals/mp/VielmaN11 . Unfortunately, there are only a few general techniques to obtain these characterizations. One of the most general results by Balas, Blair and Jeroslow balas88 ; blair90 ; jeroslow88 considers unions of polyhedra with a common geometric structure (see Proposition 5.1 in Section 5.1). In contrast, non-polyhedral results require more structure and fall into two classes. The first class considers convex sets contained in orthogonal spaces mohit2 and can be stated in our gauge notation as follows.

Theorem 2.3 (mohit2 )

Let $\left\{b^{i}\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ , $\left\{C^{i}\right\}_{i=1}^{k}$ be a finite family of compact convex sets in $\mathbb{R}^{n}$ and $\left\{J_{i}\right\}_{i=1}^{k}$ be disjoint sets such that $\bigcup_{i=1}^{k}J_{i}=\left\llbracket n\right\rrbracket$ and for all $i\in\left\llbracket k\right\rrbracket$ we have $b^{i}\in C^{i}$ and $C^{i}\subseteq\left\{x\in\mathbb{R}^{n}\,:\,x_{j}=0\quad\forall j\in\left\llbracket n\right\rrbracket\setminus J_{i}\right\}$ . Then an ideal formulation for $x\in\bigcup_{i=1}^{k}C^{i}$ is given by

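$$\gamma_{C^{i}-b^{i}}\left(\left[x-b^{i}y_{i}\right]_{J_{i}}\right)\leq y_{i}\quad\forall i\in\left\llbracket k\right\rrbracket,\quad\sum\nolimits_{i=1}^{k}y_{i}=1,\quad y\in\left\{0,1\right\}^{k}.$$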

where for $a\in\mathbb{R}^{n}$ and $J\subseteq\left\llbracket n\right\rrbracket$ we let $\left[a\right]_{J}\in\mathbb{R}^{n}$ be such that $\left(\left[a\right]_{J}\right)_{j}=a_{j}$ if $j\in J$ and $\left(\left[a\right]_{J}\right)_{j}=0$ otherwise.
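A small instance may help to see how the formulation of Theorem 2.3 works without variable copies. The sketch below assumes the simplest case in which each $J_{i}$ is a single coordinate, so every $C^{i}$ is a segment on its own axis; the names `gauge_interval`, `satisfies_formulation` and `SETS`, and the tuple encoding of the sets, are hypothetical choices for illustration.

```python
import math

def gauge_interval(lo, hi, u):
    """Gauge of the interval [lo, hi] (with lo <= 0 <= hi) at the scalar u."""
    if u > 0:
        return u / hi if hi > 0 else math.inf
    if u < 0:
        return u / lo if lo < 0 else math.inf
    return 0.0

def satisfies_formulation(sets, x, y, eps=1e-9):
    """Continuous relaxation of the formulation when each J_i is one axis.

    sets[i] = (axis, lo, hi, b): C^i is the segment with lo <= x[axis] <= hi
    and all other coordinates zero; b in [lo, hi] is the nonzero coordinate
    of the anchor point b^i.
    """
    if abs(sum(y) - 1) > eps or min(y) < -eps:
        return False
    for (axis, lo, hi, b), yi in zip(sets, y):
        u = x[axis] - b * yi  # the only nonzero entry of [x - b^i y_i]_{J_i}
        if gauge_interval(lo - b, hi - b, u) > yi + eps:
            return False
    return True

# C^1 = [-1, 2] x {0} anchored at (0, 0); C^2 = {0} x [1, 3] anchored at (0, 2).
SETS = [(0, -1.0, 2.0, 0.0), (1, 1.0, 3.0, 2.0)]
```

When $y={\bf e}^{1}$ the constraint for $i=2$ forces $x_{2}=0$, and when $y={\bf e}^{2}$ the constraint for $i=1$ forces $x_{1}=0$, so integral $y$ recovers exactly the two segments.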

The second class considers sets with certain monotonicity properties and generalizes “on/off” constraints gunluk2010perspective ; lodi15 ; Hijazi .

Theorem 2.4 (lodi15 ; Hijazi )

Let $G^{1},G^{2}\subseteq\mathbb{R}^{n}$ be closed convex sets such that $G^{1}_{\infty}=\mathbb{R}^{n}_{-}$ and $G^{2}_{\infty}=\mathbb{R}^{n}_{+}$. Furthermore, for each $i\in\left\llbracket 2\right\rrbracket$, let $l^{i},u^{i}\in\mathbb{R}^{n}$ and $C^{i}:=\left\{x\in G^{i}\,:\,l_{j}^{i}\leq x_{j}\leq u_{j}^{i}\quad\forall j\in\left\llbracket n\right\rrbracket\right\}$ be such that $l^{i}_{j}=\min\left\{x_{j}\,:\,x\in C^{i}\right\}$ and $u^{i}_{j}=\max\left\{x_{j}\,:\,x\in C^{i}\right\}$ for all $j\in\left\llbracket n\right\rrbracket$, $b^{1}=l^{1}$ and $b^{2}=u^{2}$. Then an ideal formulation for $x\in C^{1}\cup C^{2}$ is given by

[TABLE]

The most general known version of this result (e.g. Theorem 4 in lodi15 ) is obtained by combining Theorem 2.4 with Lemma 2 and noting that the result is still valid if we flip or mirror the axes of the $x$ variables. In fact, Theorems 2.4 and 2.3 can also be easily extended further by combining formulation (7) and any orthogonal transformation of the $x$ variables (i.e. axis flip plus rotation).

3 Ideal Formulations without Variable Copies

To construct the projection of formulation (2) onto the $x$ and $y$ variables we use a geometric characterization introduced in Embedding for the polyhedral setting. This characterization is based on the Cayley trick or Cayley Embedding, which is used to study Minkowski sums of polyhedra (e.g. caytrick ; karavelas2013maximum ; WeibelPhd ). The characterization in Embedding uses a generalization of the Cayley Embedding to consider alternative uses of $0$-$1$ variables (beyond the $k$ variables $y_{i}$ that add to one used in (2)). However, for simplicity we only generalize the standard version to the non-polyhedral setting through the following result, which we prove in Section 8.1.

Proposition 1

Let $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ and $Q\left(\mathcal{C}\right):=\operatorname{conv}\left(\bigcup\nolimits_{i=1}^{k}C^{i}\times\left\{{\bf e}^{i}\right\}\right)$, where ${\bf e}^{i}$ is the $i$-th $k$-dimensional unit vector. Then

1. $Q\left(\mathcal{C}\right)$ is a closed convex set and $Q\left(\mathcal{C}\right)_{\infty}=\left\{\left(x,y\right)\in\mathbb{R}^{n+k}\,:\,x\in C^{1}_{\infty},\;y=0\right\}$,

2. $Q\left(\mathcal{C}\right)$ is the projection of the continuous relaxation of (2) onto the $x$ and $y$ variables, and

3. $\left(x,y\right)\in Q\left(\mathcal{C}\right),\quad y\in\mathbb{Z}^{k}$ is an ideal formulation of $x\in\bigcup_{i=1}^{k}C^{i}$.

Proposition 1 reduces the construction of an ideal formulation to that of the convex hull defining $Q\left(\mathcal{C}\right)$, which can be as challenging as the projection of (2). Fortunately, as illustrated in the following propositions, it can sometimes be easily constructed for special structures. The first structure we consider is nearly-homothetic sets that are almost translations and scalings of one another (we replace the scaling by $0$ with the common recession cone of the sets).

Proposition 2

Let $C\subseteq\mathbb{R}^{n}$ be a closed convex set such that ${\bf 0}\in C$ , $\left\{b^{i}\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ and $r\in\mathbb{R}^{k}_{+}\setminus\left\{{\bf 0}\right\}$ . If $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}$ is such that $C^{i}=r_{i}C+b^{i}+C_{\infty}$ for each $i\in\left\llbracket k\right\rrbracket$ , then $\left(x,y\right)\in Q\left(\mathcal{C}\right)$ if and only if

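$$\gamma_{C}\left(x-\sum\nolimits_{i=1}^{k}y_{i}b^{i}\right)\leq\sum\nolimits_{i=1}^{k}r_{i}y_{i},\quad\sum\nolimits_{i=1}^{k}y_{i}=1,\quad y_{i}\geq 0\quad\forall i\in\left\llbracket k\right\rrbracket.\tag{8}$$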

Proof

Let $Q$ be the set of points that satisfy (8). Then $Q$ is convex and $C^{i}\times\left\{{\bf e}^{i}\right\}\subseteq Q$ for all $i\in\left\llbracket k\right\rrbracket$ , so we have $Q\left(\mathcal{C}\right)\subseteq Q$ . Finally, if $\left(x,y\right)\in Q$ , then $x\in\left(\sum_{i=1}^{k}y_{i}r_{i}\right)C+\sum_{i=1}^{k}y_{i}b^{i}\subseteq\left(\sum_{i=1}^{k}y_{i}r_{i}\right)C+\sum_{i=1}^{k}y_{i}b^{i}+C_{\infty}=\sum_{i=1}^{k}y_{i}\left(r_{i}C+b^{i}+C_{\infty}\right)$ and hence $\left(x,y\right)\in Q\left(\mathcal{C}\right)$ . ∎
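To make Proposition 2 concrete, take $C$ to be the Euclidean unit ball, so that $\gamma_{C}$ is the Euclidean norm, $C_{\infty}=\left\{{\bf 0}\right\}$, and each $C^{i}$ is a ball of radius $r_{i}$ centered at $b^{i}$ (a single point when $r_{i}=0$). The sketch below checks membership in $Q\left(\mathcal{C}\right)$ through the norm inequality; `in_Q`, `CENTERS` and `RADII` are hypothetical names and the data is made up for illustration.

```python
import math

def in_Q(centers, radii, x, y, eps=1e-9):
    """Check ||x - sum_i y_i b^i|| <= sum_i r_i y_i, sum_i y_i = 1, y >= 0."""
    if abs(sum(y) - 1) > eps or min(y) < -eps:
        return False
    # Weighted center sum_i y_i b^i, computed coordinate by coordinate.
    shift = [sum(yi * b[j] for yi, b in zip(y, centers)) for j in range(len(x))]
    radius = sum(r * yi for r, yi in zip(radii, y))
    return math.dist(x, shift) <= radius + eps

# C^1 is the unit disk centered at the origin, C^2 is the single point (4, 0).
CENTERS = [(0.0, 0.0), (4.0, 0.0)]
RADII = [1.0, 0.0]
```

The midpoint of $(0,1)\in C^{1}$ and $(4,0)\in C^{2}$ is $(2,1/2)$, and `in_Q(CENTERS, RADII, (2.0, 0.5), (0.5, 0.5))` holds with equality in the norm inequality, as expected for a convex combination of boundary points with a common normal direction.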

The second structure we consider is a technical generalization of Theorem 2.3 that we later use to generalize Theorem 2.4. This generalization relaxes the orthogonality requirement of Theorem 2.3 by allowing sets that are the Minkowski sum of the orthogonal sets and the non-negative orthant. This requires adding technical restriction (9) on the sets, which generalizes the monotonicity condition of Theorem 2.4. We discuss this condition further in Section 5.4. Finally, we explicitly consider the possible orthogonal transformation we have previously alluded to, and which we represent through an orthonormal basis. This last step allows for a more direct practical application of the result, but makes the proof more technical so we postpone it to Section 8.2.

Proposition 3

Let $\left\{G^{i}\right\}_{i=1}^{k}$ be closed convex sets in $\mathbb{R}^{n}$ such that ${\bf 0}\in G^{i}$ for all $i\in\left\llbracket k\right\rrbracket$ , $\left\{v^{j}\right\}_{j=1}^{n}\subseteq\mathbb{R}^{n}$ be an orthonormal basis of $\mathbb{R}^{n}$ , and $\left\{J_{i}\right\}_{i=1}^{k}$ be disjoint sets such that $\left\llbracket n\right\rrbracket=\bigcup_{i=1}^{k}J_{i}$ , $\left\{b^{i}\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ , $t\in\left\{-1,1\right\}^{n}$ and $M=\operatorname{cone}\left(\left\{t_{j}v^{j}\right\}_{j=1}^{n}\right)$ . Finally, let $\left\{s^{i}\right\}_{i=1}^{k}\subseteq\left\{-1,0,1\right\}^{n}$ be such that for each $i\in\left\llbracket k\right\rrbracket$ we have $s^{i}_{j}=0$ for all $j\notin J_{i}$ , $K^{i}=\operatorname{cone}\left(\left\{s^{i}_{j}v^{j}\right\}_{j=1}^{n}\right)$ and $C^{i}:=b^{i}+G^{i}\cap K^{i}+M$ . If for all $i\in\left\llbracket k\right\rrbracket$ we have

$$\left(\left(G^{i}\cap K^{i}\right)-K^{i}\right)\cap K^{i}=G^{i}\cap K^{i}\tag{9}$$

and $G^{i}\cap K^{i}$ is compact, then $\left(x,y\right)\in Q\left(\mathcal{C}\right)$ if and only if

[TABLE]

where $\left(a\right)^{+}=\max\left\{0,a\right\}$ for any $a\in\mathbb{R}$ and for all $i,l\in\left\llbracket k\right\rrbracket$ and $j\in\left\llbracket n\right\rrbracket$ we let $u^{i,j}=\left(-s_{j}^{i}t_{j}\right)^{+}s_{j}^{i}v^{j}$ , $\underline{b}^{i}_{j}=\min\left\{t_{j}v^{j}\cdot x\,:\,x\in b^{i}+G^{i}\cap K^{i}\right\}$ and $\overline{b}^{i,l}_{j}=\max\left\{s_{j}^{i}v^{j}\cdot x\,:\,x\in b^{l}+G^{l}\cap K^{l}\right\}$ if $i\neq l$ and $\overline{b}^{i,i}_{j}=u^{i,j}\cdot b^{i}$ .

4 Boundary Structure of the Cayley Embedding

To characterize $Q\left(\mathcal{C}\right)$ for more complicated unions we will use the special structure of its boundary. It is known that if all $C^{i}$ are polytopes, then every face of $Q\left(\mathcal{C}\right)$ is of the form $\operatorname{conv}\left(\bigcup_{i=1}^{k}F^{i}\times\left\{{\bf e}^{i}\right\}\right)$ where the $F^{i}$ are faces of $C^{i}$ whose normals intersect caytrick ; karavelas2013maximum ; WeibelPhd . We generalize this result beyond polyhedra using standard properties of the boundary of a closed convex set (e.g. hiriart-lemarechal-2001 ).

Definition 3

The support function of $S\subseteq\mathbb{R}^{n}$ is the function $\sigma_{S}:\mathbb{R}^{n}\to\mathbb{R}\cup\left\{\infty\right\}$ defined by $\sigma_{S}\left(d\right):=\sup\left\{d\cdot x\,:\,x\in S\right\}$ . The domain of $\sigma_{S}$ is the set $\operatorname{dom}\left(\sigma_{S}\right):=\left\{d\in\mathbb{R}^{n}\,:\,\sigma_{S}\left(d\right)<\infty\right\}$ .

For a closed convex set $C\subseteq\mathbb{R}^{n}$ we denote its boundary by $\operatorname{bd}\left(C\right)=C\setminus\operatorname{int}\left(C\right)$ , its relative boundary by $\operatorname{rbd}\left(C\right)=C\setminus\operatorname{ri}\left(C\right)$ , its affine hull by $\operatorname{aff}\left(C\right)$ and the linear subspace parallel to $\operatorname{aff}\left(C\right)$ by $L\left(C\right)$ .

The face of $C$ exposed by $d\in\mathbb{R}^{n}$ is $F_{C}\left(d\right):=\left\{x\in C\,:\,d\cdot x=\sigma_{C}\left(d\right)\right\}$ and its normal cone at $x\in\operatorname{bd}\left(C\right)$ is $N_{C}(x):=\left\{d\in\mathbb{R}^{n}\,:\,d\cdot(y-x)\leq 0\quad\forall y\in C\right\}$ . The tangent cone $T_{C}(x)$ to $C$ at $x\in\operatorname{bd}\left(C\right)$ is the polar of $N_{C}(x)$ .
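For a polytope given by its vertices, the support function, exposed faces and normal cones defined above are easy to evaluate. The sketch below is only an illustration under that assumption; `dot`, `support`, `exposed_vertices`, `in_normal_cone` and `BOX` are hypothetical names, and the box is $[-5/4,5/4]^{2}$.

```python
def dot(d, v):
    return sum(di * vi for di, vi in zip(d, v))

def support(vertices, d):
    """sigma_C(d) for the polytope C = conv(vertices)."""
    return max(dot(d, v) for v in vertices)

def exposed_vertices(vertices, d, eps=1e-12):
    """Vertices of the exposed face F_C(d); the face is their convex hull."""
    s = support(vertices, d)
    return [v for v in vertices if s - dot(d, v) <= eps]

def in_normal_cone(vertices, x, d, eps=1e-12):
    """For x in C, d lies in N_C(x) exactly when d . x = sigma_C(d)."""
    return support(vertices, d) - dot(d, x) <= eps

BOX = [(-1.25, -1.25), (-1.25, 1.25), (1.25, -1.25), (1.25, 1.25)]
```

For example, the direction $(1,-1)$ exposes only the corner $(5/4,-5/4)$, while $(1,0)$ exposes the whole right edge; every direction in $\operatorname{cone}\left(\left\{(1,0),(0,-1)\right\}\right)$, such as $(2,-1/2)$, belongs to the normal cone at that corner.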

Proposition 4

Let $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ , $\left\{b^{i}\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ be such that $b^{i}\in C^{i}$ for all $i\in\left\llbracket k\right\rrbracket$ and $A\in\mathbb{R}^{r\times n}$ be such that $L\left(\mathcal{C}\right):=\sum_{i=1}^{k}L\left(C^{i}\right)=\left\{x\in\mathbb{R}^{n}\,:\,Ax={\bf 0}\right\}$ . Then

[TABLE]

In addition, let $\mathcal{U}\left(\mathcal{C}\right):=\left\{u\in L\left(\mathcal{C}\right)\setminus\left\{\bf 0\right\}\,:\,F_{C^{i}}(u)\neq\emptyset\quad\forall i\in\left\llbracket k\right\rrbracket\right\}$ , $N\left(\mathcal{C}\right):=\left\{\left({x}^{i}\right)_{i=1}^{k}\in\operatorname*{\scalerel*{\times}{\sum}}\nolimits_{i=1}^{k}\operatorname{bd}\left(C^{i}\right)\,:\,L\left(\mathcal{C}\right)\cap\bigcap\nolimits_{i=1}^{k}N_{C^{i}}\left({x}^{i}\right)\neq\left\{\bf 0\right\}\right\}$ , and for each $\mathcal{X}:=\left({x}^{i}\right)_{i=1}^{k}\in N\left(\mathcal{C}\right)$ let $Q\left(\mathcal{X}\right):=\operatorname{conv}\left(\bigcup_{i=1}^{k}\left\{{x}^{i}\right\}\times\left\{{\bf e}^{i}\right\}\right)$ . Then $\operatorname{rbd}\left(Q\left(\mathcal{C}\right)\right)$ is equal to the union of

[TABLE]

and

[TABLE]

We postpone the proof of Proposition 4 to Section 8.3 and instead illustrate it in the following example.

Example 2

Let $C^{1}=\left\{x\in\mathbb{R}^{2}\,:\,\left(2-v_{1}x_{1}\right)\left(2-v_{2}x_{2}\right)\geq 1\quad\forall v\in\left\{-1,1\right\}^{2}\right\}$ and $C^{2}=[-5/4,5/4]^{2}$ be the sets from Example 1 depicted in Figures 1 and 2(b). Let $\hat{x}^{2}=(5/4,-5/4)\in\operatorname{bd}\left(C^{2}\right)$ and $\hat{B}^{1}:=\left\{x\in C^{1}\,:\,\left(2-x_{1}\right)\left(2+x_{2}\right)=1\right\}\subseteq\operatorname{bd}\left(C^{1}\right)$ be the boundary subsets highlighted in black in Figure 2(b) (the range of their normals is depicted by dashed arrows). Then $N_{C^{1}}\left({x}^{1}\right)\cap N_{C^{2}}\left(\hat{x}^{2}\right)\neq\left\{\bf 0\right\}$ if and only if ${x}^{1}\in\hat{B}^{1}$ and hence by Proposition 4 we have that $\hat{B}:=\bigcup_{{x}^{1}\in\hat{B}^{1}}\operatorname{conv}\left(\left(\left\{{x}^{1}\right\}\times\left\{{\bf e}^{1}\right\}\right)\cup\left(\left\{\hat{x}^{2}\right\}\times\left\{{\bf e}^{2}\right\}\right)\right)\subseteq\operatorname{rbd}\left(Q\left(\mathcal{C}\right)\right)$ . This is illustrated in Figure 2(a), where we use the fact that $y_{1}+y_{2}=1$ for all $\left(x,y\right)\in Q\left(\mathcal{C}\right)$ to eliminate $y_{2}$ and depict $Q\left(\mathcal{C}\right)$ in three dimensions. In Figure 2(a) the representations (i.e. after eliminating $y_{2}$ ) of $\hat{B}^{1}\times\left\{{\bf e}^{1}\right\}$ and $\left\{\hat{x}^{2}\right\}\times\left\{{\bf e}^{2}\right\}$ are highlighted in black and $\hat{B}$ corresponds to the meshed surface. This surface is an example of a portion of the boundary of $Q\left(\mathcal{C}\right)$ considered in (11).
We obtain another example of this portion if we let $\tilde{x}^{1}:=\left(0,-3/2\right)\in\operatorname{bd}\left(C^{1}\right)$ and $B^{2}:=\left\{x\in C^{2}\,:\,x_{2}=-5/4\right\}\subseteq\operatorname{bd}\left(C^{2}\right)$ be the boundary subsets highlighted in white in Figure 2(b), for which $N_{C^{1}}\left(\tilde{x}^{1}\right)\cap N_{C^{2}}\left(x^{2}\right)\neq\left\{\bf 0\right\}$ if and only if $x^{2}\in B^{2}$ , and $\tilde{B}:=\operatorname{conv}\left(\left(\left\{\tilde{x}^{1}\right\}\times\left\{{\bf e}^{1}\right\}\right)\cup\left(B^{2}\times\left\{{\bf e}^{2}\right\}\right)\right)\subseteq\operatorname{rbd}\left(Q\left(\mathcal{C}\right)\right)$ . An example of a portion considered in (12) is simply $C^{1}\times\left\{{\bf e}^{1}\right\}$ whose representation is depicted by the dotted surface in Figure 2(a). ∎
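Example 2 can be sanity-checked numerically. Every point of $\tilde{B}$ satisfies $-x_{2}=(3/2)y_{1}+(5/4)y_{2}$ , so $\tilde{B}\subseteq\operatorname{rbd}\left(Q\left(\mathcal{C}\right)\right)$ is consistent with $-x_{2}\leq(3/2)y_{1}+(5/4)(1-y_{1})$ being valid for $Q\left(\mathcal{C}\right)$ . The Python sketch below samples points of $Q\left(\mathcal{C}\right)$ as convex combinations of points of $C^{1}\times\left\{{\bf e}^{1}\right\}$ and $C^{2}\times\left\{{\bf e}^{2}\right\}$ and records the smallest observed slack. The function names are hypothetical, and the rejection sampler only uses that $C^{1}\subseteq[-2,2]^{2}$ , which follows from the definition of $C^{1}$ . A random test of this kind can only fail to find a violation; it is not a proof.

```python
import random

def in_C1(x):
    """Membership in C^1 = {x : (2 - v1*x1)(2 - v2*x2) >= 1 for all v in {-1,1}^2}."""
    return all((2 - v1 * x[0]) * (2 - v2 * x[1]) >= 1
               for v1 in (-1, 1) for v2 in (-1, 1))

def sample_C1(rng):
    """Rejection sampling from the bounding box [-2, 2]^2 of C^1."""
    while True:
        x = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        if in_C1(x):
            return x

def sample_C2(rng):
    """Uniform sample from C^2 = [-5/4, 5/4]^2."""
    return (rng.uniform(-1.25, 1.25), rng.uniform(-1.25, 1.25))

def slack(x1, x2, y1):
    """Slack of -x_2 <= (3/2) y_1 + (5/4)(1 - y_1) at x = y1*x1 + (1 - y1)*x2."""
    second_coordinate = y1 * x1[1] + (1 - y1) * x2[1]
    return 1.5 * y1 + 1.25 * (1 - y1) + second_coordinate

def min_slack(trials=2000, seed=0):
    """Smallest slack observed over random points of Q(C)."""
    rng = random.Random(seed)
    return min(slack(sample_C1(rng), sample_C2(rng), rng.random())
               for _ in range(trials))
```

Evaluating `slack` at $\tilde{x}^{1}=(0,-3/2)$ and any point of $B^{2}$ gives zero slack (up to rounding), which matches the role of $\tilde{B}$ as a boundary piece.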

Example 2 illustrates how the characterization of $\operatorname{bd}\left(Q\left(\mathcal{C}\right)\right)$ from Proposition 4 can be turned into a piecewise description composed of a finite number of sets (e.g. $\hat{B}$ , $\tilde{B}$ , $C^{1}\times\left\{{\bf e}^{1}\right\}$ , etc.). All sets associated to (12) have simple explicit descriptions that yield trivial valid inequalities for $Q\left(\mathcal{C}\right)$ (e.g. $C^{1}\times\left\{{\bf e}^{1}\right\}$ yields $y_{1}\leq 1$ or equivalently $y_{2}\geq 0$ ). In contrast, the sets associated to (11) yield non-trivial valid inequalities, but do not always have clear explicit descriptions (e.g. $\tilde{B}$ yields $-x_{2}\leq(3/2)y_{1}+(5/4)(1-y_{1})$ , but the non-linear inequality associated to $\hat{B}$ is harder to describe). Fortunately, it is sometimes possible to directly obtain a finite piecewise description of $Q\left(\mathcal{C}\right)$ . The first step is to describe $Q\left(\mathcal{C}\right)$ as a finite intersection of similar sets, but with known descriptions.

Proposition 5

Let $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ , $\mathcal{C}^{j}:=\left\{C^{j,i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ for each $j\in\left\llbracket m\right\rrbracket$ and $U=\bigcap_{i=1}^{k}\operatorname{dom}\left(\sigma_{C^{i}}\right)\setminus\left\{{\bf 0}\right\}$ or $U=\mathbb{R}^{n}$ . If

[TABLE]

then $Q\left(\mathcal{C}\right)=\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)$ .

Proof

Let $Q=\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)$ . Condition (13a) implies $Q\left(\mathcal{C}\right)\subseteq Q$ . For $Q\subseteq Q\left(\mathcal{C}\right)$ we show $\sigma_{Q}\left(u,v\right)\leq\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right)$ . If $\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right)=\infty$ this holds trivially, so we assume $\left(u,v\right)\in\operatorname{dom}\left(\sigma_{Q\left(\mathcal{C}\right)}\right)$ . By Theorem 3.3.2 in hiriart-lemarechal-2001 we have

[TABLE]

Then $\operatorname{dom}\left(\sigma_{Q\left(\mathcal{C}\right)}\right)=\left(\bigcap_{i=1}^{k}\operatorname{dom}\left(\sigma_{C^{i}}\right)\right)\times\mathbb{R}^{k}$ and $u\in\operatorname{dom}\left(\sigma_{C^{i}}\right)$ for all $i\in\left\llbracket k\right\rrbracket$ . If $u=0$ , then $\sigma_{Q}\left(u,v\right)=\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right)=\max_{i=1}^{k}v_{i}$ . For $u\neq 0$ let $j\in\left\llbracket m\right\rrbracket$ be the index from condition (13b). Combining this condition with (14) and Theorem 3.3.2 in hiriart-lemarechal-2001 for $Q\left(\mathcal{C}^{j}\right)$ we finally have $\sigma_{Q}\left(u,v\right)\leq\sigma_{Q\left(\mathcal{C}^{j}\right)}\left(u,v\right)=\max\nolimits_{i=1}^{k}\sigma_{C^{j,i}}\left(u\right)+v\cdot e^{i}=\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right)$ . ∎
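The support function identity used in this proof, $\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right)=\max_{i=1}^{k}\sigma_{C^{i}}\left(u\right)+v_{i}$ , can be probed numerically when the $C^{i}$ are polytopes. The Python sketch below assumes (for illustration only) that each $C^{i}$ is given by a vertex list, samples points of $Q\left(\mathcal{C}\right)$ as convex combinations of lifted vertices, and checks that $u\cdot x+v\cdot y$ never exceeds the right-hand side. All names and the two test polytopes are hypothetical.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def support_polytope(vertices, u):
    """sigma_C(u) for C = conv(vertices)."""
    return max(dot(u, x) for x in vertices)

def support_cayley(sets, u, v):
    """Right-hand side of the identity: max_i ( sigma_{C^i}(u) + v_i )."""
    return max(support_polytope(C, u) + vi for C, vi in zip(sets, v))

def random_point_cayley(sets, rng):
    """Random (x, y) in Q(C): a convex combination of points (vertex, e^i)."""
    lifted = [(x, i) for i, C in enumerate(sets) for x in C]
    weights = [rng.random() + 1e-12 for _ in lifted]
    total = sum(weights)
    x = [0.0] * len(sets[0][0])
    y = [0.0] * len(sets)
    for w, (vertex, i) in zip(weights, lifted):
        for j, vj in enumerate(vertex):
            x[j] += w / total * vj
        y[i] += w / total
    return x, y

def max_violation(sets, trials=500, seed=0):
    """Largest observed value of u.x + v.y minus the right-hand side."""
    rng = random.Random(seed)
    n, k = len(sets[0][0]), len(sets)
    worst = float("-inf")
    for _ in range(trials):
        u = [rng.uniform(-1, 1) for _ in range(n)]
        v = [rng.uniform(-1, 1) for _ in range(k)]
        x, y = random_point_cayley(sets, rng)
        worst = max(worst, dot(u, x) + dot(v, y) - support_cayley(sets, u, v))
    return worst

# Two hypothetical polytopes in R^2: a square and a triangle.
square = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
triangle = [(0, 0), (3, 0), (0, 2)]
sets = [square, triangle]
```

For $u={\bf 0}$ the function `support_cayley` returns $\max_{i}v_{i}$ , which is the special case treated separately in the proof.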

The second step is to combine Proposition 5 with the known descriptions of the $Q\left(\mathcal{C}^{j}\right)$ . For instance, below we combine it with Proposition 2.

Corollary 1

Let $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ and for each $j\in\left\llbracket m\right\rrbracket$ let $C^{j,0}\subseteq\mathbb{R}^{n}$ with ${\bf 0}\in C^{j,0}$ , $\left\{b^{j,i}\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ , $r^{j}\in\mathbb{R}^{k}_{+}\setminus\left\{0\right\}$ and $\mathcal{C}^{j}:=\left\{C^{j,i}\right\}_{i=1}^{k}$ be such that $C^{j,i}=r^{j}_{i}C^{j,0}+b^{j,i}+C^{j,0}_{\infty}$ for all $i\in\left\llbracket k\right\rrbracket$ and $j\in\left\llbracket m\right\rrbracket$ . If (13) holds for $\left\{\mathcal{C}^{j}\right\}_{j=1}^{m}$ , then an ideal formulation for $x\in\bigcup_{i=1}^{k}C^{i}$ is given by

[TABLE]

Example 3

Let $C^{1}$ and $C^{2}$ again be the sets from Example 1 depicted in Figures 1 and 2(b). To construct an ideal formulation for $x\in C^{1}\cup C^{2}$ we divide directions $u\in\mathbb{R}^{2}\setminus\left\{0\right\}$ for condition (13b) into four classes. For each $s\in\left\{-1,1\right\}^{2}$ let $C^{s,1}:=\left\{x\in\mathbb{R}^{2}\,:\,\left(2-s_{1}x_{1}\right)\left(2-s_{2}x_{2}\right)\geq 1,\;s_{j}x_{j}\leq 3/2\;\forall j\in\left\llbracket 2\right\rrbracket\right\}$ , $C^{s,2}:=\left\{x\in\mathbb{R}^{2}\,:\,s_{j}x_{j}\leq 5/4\quad\forall j\in\left\llbracket 2\right\rrbracket\right\}$ and $D_{s}:=\left\{u\in\mathbb{R}^{2}\,:\,s_{1}u_{1}\geq 0,\quad s_{2}u_{2}\geq 0\right\}$ . For $s=(1,-1)$ , Figure 2(b) depicts $C^{s,1}$ and $C^{s,2}$ in light gray and illustrates how condition (13) is satisfied: for each $i\in\left\llbracket 2\right\rrbracket$ , $s\in\left\{-1,1\right\}^{2}$ and $u\in D_{s}$ we have $\sigma_{C^{i}}\left(u\right)=\sigma_{C^{s,i}}\left(u\right)$ and $C^{i}\subseteq C^{s,i}$ . Finally, if we let $C^{s,0}=C^{s,1}$ , $r^{s}_{1}=1$ , $b^{s,1}=(0,0)^{T}$ , $r^{s}_{2}=0$ and $b^{s,2}=\left(s_{1}(5/4),s_{2}(5/4)\right)^{T}$ we have $C^{s,i}=r^{s}_{i}C^{s,0}+b^{s,i}+C^{s,0}_{\infty}$ for all $i\in\left\llbracket 2\right\rrbracket$ and

[TABLE]

Then (15) yields the ideal formulation of $x\in C^{1}\cup C^{2}$ given by

[TABLE]

where we used the fact that $s_{1}^{2}=s_{2}^{2}=1$ for all $s\in\left\{-1,1\right\}^{2}$ to simplify the nonlinear inequalities in $x$ and $y$ .∎

Note that the key to effectively satisfying condition (13b) was to include $s_{j}x_{j}\leq 3/2$ in the definition of $C^{s,1}$ . Indeed, as can be glimpsed from Figure 2(b) if we omitted these constraints for $s=(1,-1)$ , we would have $\sigma_{C^{1}}\left(0,-1\right)=3/2<2=\sigma_{C^{s,1}}\left(0,-1\right)$ . Another way to understand the need for these inequalities is by noting that for $s=(1,-1)$ they ensure that $N_{C^{s,1}}\left(\tilde{x}^{1}\right)\cap N_{C^{s,2}}\left(x^{2}\right)\neq\left\{\bf 0\right\}$ for $\tilde{x}^{1}:=\left(0,-3/2\right)$ and all $x^{2}\in B^{2}:=\left\{x\in C^{2}\,:\,x_{2}=-5/4\right\}$ (cf. white boundary subsets depicted in Figure 2(b) and discussed in Example 2). This last observation can be useful to construct families $\left\{\mathcal{C}^{j}\right\}_{j=1}^{m}$ that satisfy condition (13b) (and verify that they do satisfy it) so we formalize it in Corollary 3 of Section 6. However, we first showcase some important applications where (13b) can be easily verified.
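For the box $C^{2}=[-5/4,5/4]^{2}$ of Example 3 the two properties quoted there ( $C^{2}\subseteq C^{s,2}$ and $\sigma_{C^{2}}\left(u\right)=\sigma_{C^{s,2}}\left(u\right)$ for $u\in D_{s}$ ) can be verified with closed-form support functions. The following Python sketch is only an illustration under that setup: $\sigma_{C^{2}}\left(u\right)=(5/4)\left(\left|u_{1}\right|+\left|u_{2}\right|\right)$ , while $\sigma_{C^{s,2}}\left(u\right)=(5/4)\left(s_{1}u_{1}+s_{2}u_{2}\right)$ if $u\in D_{s}$ and $+\infty$ otherwise. The function names are hypothetical.

```python
import math
from itertools import product

def sigma_box(u, half_width=1.25):
    """Support function of C^2 = [-5/4, 5/4]^2."""
    return half_width * sum(abs(uj) for uj in u)

def sigma_shifted_cone(u, s, bound=1.25):
    """Support function of C^{s,2} = {x : s_j x_j <= 5/4 for j = 1, 2}."""
    if any(sj * uj < 0 for sj, uj in zip(s, u)):
        return math.inf  # u is not in D_s, so C^{s,2} is unbounded along u
    return bound * sum(sj * uj for sj, uj in zip(s, u))

def sign_class(u):
    """One s in {-1,1}^2 with u in D_s (zeros are assigned the sign +1)."""
    return tuple(1 if uj >= 0 else -1 for uj in u)

directions = [(2, -3), (0, 1), (-1.5, -0.5), (4, 0)]
for u in directions:
    s = sign_class(u)
    print(u, s, sigma_box(u), sigma_shifted_cone(u, s))
```

The inclusion $C^{2}\subseteq C^{s,2}$ shows up as $\sigma_{C^{2}}\left(u\right)\leq\sigma_{C^{s,2}}\left(u\right)$ for every $u$ and every $s$ , with equality on $D_{s}$ .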

5 Applications of Proposition 5

While Proposition 5 and Corollary 1 are simple, together with Proposition 3 they can recover and generalize all known results from the literature.

5.1 Unions of Polyhedra

The first result that Corollary 1 can generalize is the following class of formulations introduced by Balas, Blair and Jeroslow balas88 ; blair90 ; jeroslow88 .

Definition 4

For any $A\in\mathbb{R}^{m\times n}$ and $B\subseteq\left\llbracket m\right\rrbracket$ let $A_{B}\in\mathbb{R}^{\left|B\right|\times n}$ be the sub-matrix of $A$ composed of the rows indexed by $B$ . For a fixed $A\in\mathbb{R}^{m\times n}$ let $\mathcal{B}=\left\{B\subseteq\left\llbracket m\right\rrbracket\,:\,\left|B\right|=\operatorname{rank}(A),\quad\operatorname{rank}\left(A_{B}\right)=\operatorname{rank}(A)\right\}$ , and for any $B\in\mathcal{B}$ and $b\in\mathbb{R}^{m}$ let $P\left(B,b\right):=\left\{x\in\mathbb{R}^{n}\,:\,A_{B}x\leq b_{B}\right\}$ and $\bar{x}\left(B,b\right)\in\mathbb{R}^{n}$ be an arbitrary solution of $A_{B}x=b_{B}$ .
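The family $\mathcal{B}$ of Definition 4 can be enumerated by brute force for small matrices. The Python sketch below does this with exact rational Gaussian elimination; the helper names `rank` and `bases` and the $4\times 2$ test matrix are hypothetical, and the enumeration is exponential in $m$ , so it is meant only to make the definition concrete.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(a) for a in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def bases(A):
    """All B (1-based row indices) with |B| = rank(A) and rank(A_B) = rank(A)."""
    rk = rank(A)
    return [B for B in combinations(range(1, len(A) + 1), rk)
            if rank([A[i - 1] for i in B]) == rk]

# Hypothetical matrix: rows 1 and 3 (and rows 2 and 4) are parallel,
# so {1, 3} and {2, 4} are not in the family.
A = [[1, 0], [0, 1], [-1, 0], [0, -1]]
print(bases(A))
```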

Theorem 5.1 (Theorem 2 in blair90 )

Let $A\in\mathbb{R}^{m\times n}$ and for each $i\in\left\llbracket k\right\rrbracket$ let $b^{i}\in\mathbb{R}^{m}$ and $P^{i}=\left\{x\in\mathbb{R}^{n}\,:\,Ax\leq b^{i}\right\}$ . If

[TABLE]

then an ideal formulation of $x\in\bigcup_{i=1}^{k}P^{i}$ is given by

[TABLE]

Corollary 1 generalizes Theorem 5.1 as follows.

Corollary 2

Let $A\in\mathbb{R}^{m\times n}$ and for each $i\in\left\llbracket k\right\rrbracket$ let $b^{i}\in\mathbb{R}^{m}$ and $P^{i}=\left\{x\in\mathbb{R}^{n}\,:\,Ax\leq b^{i}\right\}$ . If for all $c\in\mathbb{R}^{n}$ there exists $B\in\mathcal{B}$ such that

[TABLE]

then (17) is an ideal formulation of $x\in\bigcup_{i=1}^{k}P^{i}$ .

Proof

For all $B\in\mathcal{B}$ let $\mathcal{C}^{B}:=\left\{C^{B,i}\right\}_{i=1}^{k}$ be such that $C^{B,i}=P\left(B,b^{i}\right)=P\left(B,{\bf 0}\right)+\bar{x}\left(B,b^{i}\right)$ for all $i\in\left\llbracket k\right\rrbracket$ . Condition (13a) is trivially satisfied and condition (13b) is satisfied by the corollary’s assumption. The result follows from Corollary 1 by noting that $\left\llbracket m\right\rrbracket=\bigcup_{B\in\mathcal{B}}B$ and that, because $\operatorname{epi}\left(\gamma_{P\left(B,{\bf 0}\right)}\right)=\left\{\left(x,y\right)\in\mathbb{R}^{n+1}\,:\,y\geq 0,\quad A_{B}x\leq 0\right\}$ , we have $\left(x,y\right)\in Q\left(\mathcal{C}^{B}\right)$ if and only if

[TABLE]

The sufficient condition of Theorem 5.1 implies that of Corollary 2, but the following example adapted from Mixed-Integer-Linear-Programming-Formulation-Techniques shows that the converse may not hold.

Example 4

Consider

[TABLE]

We can check that $B_{1}:=\left\{1,2,3\right\}\in\mathcal{B}$ , $\bar{x}\left(B_{1},b^{1}\right)=\left(0,1,1\right)\in P^{1}$ and $\bar{x}\left(B_{1},b^{2}\right)=\left(0,-1,2\right)\notin P^{2}$ . Furthermore,

[TABLE]

Then, neither Theorem 5.1 nor Corollary 2 is applicable and indeed formulation (17) for these matrix/vectors is not ideal ( $x=\left(0,0,3/2\right)$ and $y=\left(1/2,1/2\right)$ is an extreme point of its LP relaxation). However, if we augment $A$ , $b^{1}$ and $b^{2}$ with the redundant inequality $x_{3}\leq 1$ (i.e. let the fifth row of $A$ be $\left(0,0,1\right)$ and $b^{1}_{5}=b^{2}_{5}=1$ ) we have that $B_{2}:=\left\{1,2,5\right\}\in\mathcal{B}$ and

[TABLE]

Moreover, with this additional inequality/row we have that for any $u\in\mathbb{R}^{3}$ condition (18) either holds trivially (i.e. with $+\infty$ on both sides) or for a basis of the form $B=\left\{i,j,5\right\}$ for $i,j\in\left\llbracket 4\right\rrbracket$ . Hence, Corollary 2 shows that (17) for this augmented matrix/vectors does yield an ideal formulation for $x\in P^{1}\cup P^{2}$ . In contrast, we still have $\bar{x}\left(B_{1},b^{1}\right)\in P^{1}$ and $\bar{x}\left(B_{1},b^{2}\right)\notin P^{2}$ for the augmented matrix/vectors so Theorem 5.1 cannot be used to prove that this formulation is ideal.∎

Theorem 5.1 and Corollary 2 are based on exploiting a common tangent structure of the $P^{i}$ . This can also be useful to (partially) satisfy condition (13) for non-polyhedral sets so we give one formalization of the approach.

Lemma 3

Let $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ , $\left\{x^{j,i}\right\}_{j=1}^{m}\subseteq\operatorname{bd}\left(C^{i}\right)$ for all $i\in\left\llbracket k\right\rrbracket$ , $\mathcal{C}^{j}:=\left\{C^{j,i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ for all $j\in\left\llbracket m\right\rrbracket$ and $C^{j,0}\subseteq\mathbb{R}^{n}$ be a closed convex cone for all $j\in\left\llbracket m\right\rrbracket$ . If $C^{j,i}=T_{C^{i}}\left(x^{j,i}\right)=x^{j,i}+C^{j,0}$ for all $i\in\left\llbracket k\right\rrbracket$ and $j\in\left\llbracket m\right\rrbracket$ , then $\left\{\mathcal{C}^{j}\right\}_{j=1}^{m}$ satisfies (13) for $U=\bigcup_{j=1}^{m}\left(C^{j,0}\right)^{*}$ .

Proof

Direct from $\sigma_{C}\left(u\right)=\sigma_{T_{C}\left(x\right)}\left(u\right)$ for all $x\in\operatorname{bd}\left(C\right)$ and $u\in T_{C}\left(x\right)^{*}$ .∎

5.2 Common tangent structure through Minkowski sum

Example 3 uses a “nearly-homothetic” variant of “conic” tangents of Lemma 3. For instance, as illustrated in Figure 2(b) for $s=(1,-1)$ and $\bar{x}=(5/4,-5/4)$ , we have that $C^{s,2}=\bar{x}+C^{s,0}_{\infty}$ is the cone tangent to $C^{2}$ at $\bar{x}$ , but no translation of $C^{s,0}_{\infty}$ is tangent to $C^{1}$ at some $x\in\operatorname{bd}\left(C^{1}\right)$ . However, $C^{s,1}=C^{1}+C^{s,0}_{\infty}$ serves the same role as the translation of $C^{s,0}_{\infty}$ through the following property.

Lemma 4

Let $C\subseteq\mathbb{R}^{n}$ be a closed convex set and $K\subseteq\mathbb{R}^{n}$ be a closed convex cone. Then $\sigma_{C}\left(u\right)=\sigma_{C+K}\left(u\right)$ for all $u\in K^{*}$ .

Proof

By Theorem C.3.3.2 in hiriart-lemarechal-2001 $\sigma_{C+K}\left(u\right)=\sigma_{C}\left(u\right)+\sigma_{K}\left(u\right)$ and $\sigma_{K}\left(u\right)=0$ for all $u\in K^{*}$ . ∎
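Lemma 4 can be illustrated numerically when $C$ is a polytope and $K$ is finitely generated. The Python sketch below (hypothetical names, stated assumptions) replaces $K=\operatorname{cone}\left(\left\{g\right\}\right)$ by the truncation $\left\{\sum_{g}t_{g}g\,:\,0\leq t_{g}\leq T\right\}$ , whose support function is $\sum_{g}\left(T\,u\cdot g\right)^{+}$ . For $u\in K^{*}$ every term vanishes, so the support function of the sum equals $\sigma_{C}\left(u\right)$ for every $T$ ; for $u\notin K^{*}$ it grows linearly in $T$ . The data are the box $[-5/4,5/4]^{2}$ and the cone $\left\{x\,:\,x_{1}\leq 0,\;x_{2}\geq 0\right\}$ , which is $C^{s,0}_{\infty}$ from Example 3 for $s=(1,-1)$ .

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def support_polytope(vertices, u):
    """sigma_C(u) for C = conv(vertices)."""
    return max(dot(u, x) for x in vertices)

def in_polar(generators, u):
    """u in K^* for K = cone(generators): u.g <= 0 for every generator g."""
    return all(dot(u, g) <= 0 for g in generators)

def support_truncated_sum(vertices, generators, u, T):
    """Support function of C + {sum_g t_g g : 0 <= t_g <= T}."""
    return support_polytope(vertices, u) + sum(
        max(0.0, T * dot(u, g)) for g in generators)

box = [(sx * 1.25, sy * 1.25) for sx in (-1, 1) for sy in (-1, 1)]
cone_generators = [(-1, 0), (0, 1)]   # K = {x : x_1 <= 0, x_2 >= 0}

for u in [(2, -3), (-1, 0)]:
    print(u, in_polar(cone_generators, u),
          support_polytope(box, u),
          support_truncated_sum(box, cone_generators, u, T=10.0))
```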

The following example further illustrates this approach to guide the construction of $\left\{\mathcal{C}^{j}\right\}_{j=1}^{m}$ to use Corollary 1 for non-polyhedral sets with $k,n>2$ . It also illustrates how redundancy in $\left\{\mathcal{C}^{j}\right\}_{j=1}^{m}$ can simplify verification of (13).

Example 5

Let $b=\sqrt{2}-1$ , $r\in\mathbb{R}^{k}_{+}$ , $\left\{s^{i}\right\}_{i=1}^{k}\subseteq\left\{0,1\right\}^{2}$ , $\left\{\left(p_{0}^{i},p^{i}\right)\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n+1}$ , $G^{i}:=\left\{\left(x_{0},x\right)\in\mathbb{R}^{n+1}\,:\,\begin{aligned} \left\lVert\left(x,s^{i}_{l}\right)\right\rVert_{2}&\leq s^{i}_{l}\left(\sqrt{2}-1\right)+1+(-1)^{l}x_{0}\quad\forall l\in\left\llbracket 2\right\rrbracket\end{aligned}\right\}$ and $C^{i}=\left(p_{0}^{i},p^{i}\right)+r_{i}G^{i}$ for all $i\in\left\llbracket k\right\rrbracket$ . Family $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}$ is depicted in Figure 3 for $k=2$ , $n=1$ , $r=(1,1)$ , $s^{1}=\left(1,0\right)$ , $s^{2}=\left(0,1\right)$ and $\left(p_{0}^{i},p^{i}\right)=\left(0,0\right)$ for all $i\in\left\llbracket 2\right\rrbracket$ . Let $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ be such that for each $j\in\left\llbracket 2\right\rrbracket$

[TABLE]

and $C^{j,i}=\left(p_{0}^{i}-(-1)^{j}(1-s^{i}_{j})r_{i},p^{i}\right)+r_{i}s^{i}_{j}C^{j,0}$ for all $i\in\left\llbracket k\right\rrbracket$ . Then $C^{2,0}_{\infty}=-C^{1,0}_{\infty}=\left\{\left(x_{0},x\right)\in\mathbb{R}^{n+1}\,:\,\left\lVert x\right\rVert_{2}\leq x_{0}\right\}$ and for all $j\in\left\llbracket 2\right\rrbracket$ and $i\in\left\llbracket k\right\rrbracket$ we have $C^{j,i}=T_{C^{i}}\left(p^{i}_{0}-(-1)^{j},p^{i}\right)$ if $s^{i}_{j}=0$ and $C^{j,i}=C^{i}+C^{j,0}_{\infty}$ if $s^{i}_{j}=1$ . Then $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ satisfies (13) for $U=\left(C^{1}_{\infty}\right)^{*}\cup\left(C^{2}_{\infty}\right)^{*}$ . Furthermore, for each $j\in\left\llbracket 2\right\rrbracket$

[TABLE]

For all $i\in\left\llbracket k\right\rrbracket$ and $j\in\left\llbracket 2\right\rrbracket$ let $q^{i}_{j}=r_{i}s_{j}^{i}\sqrt{2}-(-1)^{j}p_{0}^{j}+(1-s_{j}^{i})r_{i}$ and $h^{i}_{j}=r_{i}s_{j}^{i}-(-1)^{j}p_{0}^{j}+(1-s_{j}^{i})r_{i}$ . Because $C^{i}=C^{1,i}\cap C^{2,i}$ , formulation (15) for $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ yields the valid formulation of $\left(x_{0},x\right)\in\bigcup_{i=1}^{k}C^{i}$ given by

[TABLE]

We have that $\left(C^{1}_{\infty}\right)^{*}\cup\left(C^{2}_{\infty}\right)^{*}=C^{2}_{\infty}\cup C^{1}_{\infty}$ is strictly contained in $\bigcap_{i=1}^{k}\operatorname{dom}\left(\sigma_{C^{i}}\right)\setminus\left\{0\right\}$ , so Corollary 1 does not imply idealness of (19). To check that it is indeed ideal let $\mathcal{C}^{3}$ be such that $C^{3,0}:=\left\{\left(x_{0},x\right)\in\mathbb{R}^{n+1}\,:\,\left\lVert x\right\rVert_{2}\leq 1+x_{0},\quad\left\lVert x\right\rVert_{2}\leq 1-x_{0}\right\}$ and $C^{3,i}=\left(p_{0}^{i},p^{i}\right)+r_{i}C^{3,0}$ for all $i\in\left\llbracket k\right\rrbracket$ . Then for all $i\in\left\llbracket k\right\rrbracket$ and $u\in\mathbb{R}^{n+1}\setminus\left(C^{2}_{\infty}\cup C^{1}_{\infty}\right)$ we have $C^{i}\subseteq C^{3,i}$ and $\sigma_{C^{3,i}}\left(u\right)=\sigma_{C^{i}}\left(u\right)$ . Finally, we have $\operatorname{epi}\left(\gamma_{C^{3,0}}\right)=\left\{\left(x_{0},x,y\right)\in\mathbb{R}^{n+2}\,:\,\left\lVert x\right\rVert_{2}\leq y+x_{0},\quad\left\lVert x\right\rVert_{2}\leq y-x_{0}\right\}$ .

We can now use Corollary 1 for $\left\{\mathcal{C}^{j}\right\}_{j=1}^{3}$ to construct an ideal formulation of $\left(x_{0},x\right)\in\bigcup_{i=1}^{k}C^{i}$ that corresponds to (19) plus the inequalities associated to $\mathcal{C}^{3}$ . However, these additional inequalities are precisely (19b). ∎

Note that in Example 3 the approach based on Lemma 4 of adding $C^{s,0}_{\infty}$ to $C^{1}$ in the definition of $C^{s,1}$ (or equivalently including $s_{j}x_{j}\leq 3/2$ in the definition of $C^{s,1}$ ) was enough to yield an ideal formulation and to verify property (13). In contrast, in Example 5 this approach was enough to yield an ideal formulation, but not to verify the property. We further discuss this in Section 6 where the boundary subsets highlighted in white in Figure 3 will play a similar role to those in Figure 2(b).

5.3 Constraints from power systems applications

A very clever technique to extend the applicability of Theorem 2.4 was introduced by bestuzheva2016convex in the context of power systems. The following example illustrates how this technique relates to the use of Lemma 4 and Corollary 1.

Example 6

Let $C^{1}:=[-1,1]\times\left\{0\right\}$ , $K\left(l,u\right):=\left\{x\in[l,u]\times[0,1]\,:\,x_{1}^{2}\leq x_{2}\right\}$ and $C^{2}:=K\left(-1,1\right)$ . We have that $\{C^{i}\}_{i=1}^{2}$ does not satisfy the assumptions of Theorem 2.4. However, bestuzheva2016convex notes that if $\tilde{C}^{2}=K\left(-1,0\right)$ or $\tilde{C}^{2}=K\left(0,1\right)$ , then (after a rotation) $\{C^{1},\tilde{C}^{2}\}$ does satisfy the assumptions. Hence, Theorem 2.4 can characterize $Q^{1}:=\operatorname{conv}\left(\left(C^{1}\times\left\{{\bf e}^{1}\right\}\right)\cup\left(K\left(-1,0\right)\times\left\{{\bf e}^{2}\right\}\right)\right)$ and $Q^{2}:=\operatorname{conv}\left(\left(C^{1}\times\left\{{\bf e}^{1}\right\}\right)\cup\left(K\left(0,1\right)\times\left\{{\bf e}^{2}\right\}\right)\right)$ . Then bestuzheva2016convex further notes $\operatorname{conv}\left(\left(C^{1}\times\left\{{\bf e}^{1}\right\}\right)\cup\left(K\left(-1,1\right)\times\left\{{\bf e}^{2}\right\}\right)\right)=\operatorname{conv}\left(Q^{1}\cup Q^{2}\right)$ and using the construction of the $Q^{i}$ from Theorem 2.4 shows that $Q^{1}\cup Q^{2}$ is convex and

[TABLE]

To instead construct a formulation using Corollary 1 let $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ be such that $C^{j,0}=C^{2}+\{x\in\mathbb{R}^{2}\,:\,\left(-1\right)^{j}x_{1}\leq 0,\quad x_{2}=0\}$ , $C^{j,1}=\left(-1\right)^{j}{\bf e}^{1}+C^{j,0}_{\infty}=\{x\in\mathbb{R}^{2}\,:\,\left(-1\right)^{j}x_{1}\leq 1,\quad x_{2}=0\}$ and $C^{j,2}=C^{j,0}$ for each $j\in\left\llbracket 2\right\rrbracket$ . Then $C^{j,0}_{\infty}=\{x\in\mathbb{R}^{2}\,:\,\left(-1\right)^{j}x_{1}\leq 0,\quad x_{2}=0\}$ , $C^{j,1}=T_{C^{1}}(\left(-1\right)^{j},0)$ and $C^{j,2}=C^{2}+C^{j,0}_{\infty}$ for all $j\in\left\llbracket 2\right\rrbracket$ . Then $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ satisfies (13) for $U=\mathbb{R}^{2}$ , so we can use Corollary 1. To obtain an explicit algebraic description of the resultant formulation, first note that for each $j\in\left\llbracket 2\right\rrbracket$ we have $C^{j,0}=\{x\in\mathbb{R}^{2}\,:\,g_{j}(x_{1})\leq x_{2},\;x_{2}\leq 1\}$ , where

[TABLE]

This construction is illustrated in Figure 4(a) for $j=1$ , where $C^{1,0}$ is depicted in gray, the graph of $x_{1}^{2}$ is depicted by the solid black curve and the graph of $g_{1}(x_{1})$ is depicted by the dotted black curve. Figure 4(a) illustrates the convexity of $g_{j}(x_{1})$ , which we can use to conclude that for each $j\in\left\llbracket 2\right\rrbracket$ we have $\operatorname{epi}\left(\gamma_{C^{j,0}}\right)=\left\{\left(x,y\right)\in\mathbb{R}^{3}\,:\,\begin{aligned} ((\left(-1\right)^{j}x_{1})^{+})^{2}&\leq y\cdot x_{2},\;x_{2}&\leq y\end{aligned}\right\}$ . Finally, the ideal formulation for $x\in C^{1}\cup C^{2}$ from Corollary 1 is given by

[TABLE]

The continuous relaxation of this formulation is identical to $Q^{1}\cup Q^{2}$ .∎
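The algebraic descriptions used in Example 6 can be spot-checked for $j=1$ . The Python sketch below builds points of $C^{1,0}=C^{2}+\left\{x\,:\,x_{1}\geq 0,\;x_{2}=0\right\}$ directly from the Minkowski sum, checks them against the description $\left(\left(-x_{1}\right)^{+}\right)^{2}\leq x_{2}\leq 1$ (our reading of $g_{1}$ , inferred from the gauge epigraph stated in the example, since the displayed definition of $g_{j}$ is not reproduced here), and checks that the scaled points $\left(yx,y\right)$ satisfy the stated description of $\operatorname{epi}\left(\gamma_{C^{1,0}}\right)$ . Function names, sampling ranges and tolerances are assumptions of this sketch.

```python
import random

def pos(a):
    """The operation (a)^+ = max{0, a}."""
    return max(0.0, a)

def in_C10(x, tol=1e-12):
    """Candidate description of C^{1,0}: ((-x_1)^+)^2 <= x_2 <= 1."""
    return pos(-x[0]) ** 2 <= x[1] + tol and x[1] <= 1 + tol

def in_epi_gauge(x, y, tol=1e-12):
    """Description of epi(gamma_{C^{1,0}}) quoted in Example 6 for j = 1."""
    return pos(-x[0]) ** 2 <= y * x[1] + tol and x[1] <= y + tol

def sample_C10(rng):
    """Point of C^{1,0} built as (point of C^2) + t * (1, 0) with t >= 0."""
    p1 = rng.uniform(-1, 1)
    p2 = rng.uniform(p1 * p1, 1)   # (p1, p2) lies in C^2 = K(-1, 1)
    t = rng.uniform(0, 3)          # step along the recession direction (1, 0)
    return (p1 + t, p2)

def count_failures(trials=2000, seed=0):
    """Number of sampled points that fail either description."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        x = sample_C10(rng)
        y = rng.uniform(0.1, 5)
        scaled = (y * x[0], y * x[1])
        if not (in_C10(x) and in_epi_gauge(scaled, y)):
            failures += 1
    return failures
```

This only tests one inclusion (sampled points satisfy the inequalities); it does not show that the inequalities describe nothing more than $C^{1,0}$ .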

The quadratic set considered in bestuzheva2016convex was an approximation of a trigonometric set bestuzheva2016convex ; hijazi2013convex . The following example shows that Lemma 4 and Corollary 1 can also be applied directly to such sets.

Example 7

Let $C^{2}:=\left\{x\in[\pi,2\pi]\times[0,1]\,:\,\sin\left(x_{1}\right)+1\leq x_{2}\right\}$ , $C^{1}:=[\pi,2\pi]\times\left\{0\right\}$ , $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ be such that $C^{j,0}=C^{2}+\{x\in\mathbb{R}^{2}\,:\,\left(-1\right)^{j}x_{1}\leq 0,\quad x_{2}=0\}$ , $C^{1,1}=\pi{\bf e}^{1}+C^{1,0}_{\infty}$ , $C^{2,1}=2\pi{\bf e}^{1}+C^{2,0}_{\infty}$ , and $C^{j,2}=C^{j,0}$ for each $j\in\left\llbracket 2\right\rrbracket$ . Similarly to Example 6, we can check that $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ satisfies (13) for $U=\mathbb{R}^{2}$ and we can use Corollary 1. To obtain an explicit algebraic description of the resultant formulation we can again write $C^{1,0}=\{x\in\mathbb{R}^{2}\,:\,g_{1}(x_{1})\leq x_{2},\;x_{2}\leq 1\}$ , where now

[TABLE]

This construction is illustrated in Figure 4(b), where $C^{1,0}$ is depicted in gray, the graph of $\sin\left(x_{1}\right)+1$ is depicted by the solid black curve and the graph of $g_{1}(x_{1})$ is depicted by the dotted black curve. Figure 4(b) shows that while $g_{1}(x_{1})$ can be used to describe $C^{1,0}$ , $g_{1}(x_{1})$ is not convex. However, we can check that $h_{1}\left(x_{1}\right)=\max\left\{g_{1}\left(x_{1}\right),\pi-x_{1}\right\}$ is convex and we can replace $g_{1}$ by $h_{1}$ in the algebraic description of $C^{1,0}$ . This is illustrated in Figure 4(c), where again $C^{1,0}$ is depicted in gray, the graph of $\sin\left(x_{1}\right)+1$ is depicted by the solid black curve and now the dotted black curve depicts the graph of $h_{1}(x_{1})$ . Using similar reasoning for $C^{2,0}$ , we can check that for both $j\in\left\llbracket 2\right\rrbracket$ we have $C^{j,0}=\left\{x\in\mathbb{R}^{2}\,:\,f_{j}\left(x\right)\leq 0,\quad\left(-1\right)^{j}x_{1}\leq\left(-1\right)^{j}j\cdot\pi,\quad 0\leq x_{2}\leq 1\right\}$ , where $f_{1}\left(x\right)=1-x_{2}+h_{1}\left(x_{1}\right)=1-x_{2}+\max\left\{\sin\left(\max\left\{-x_{1}+3\pi,3\pi/2\right\}\right),\pi-x_{1}\right\}$ and $f_{2}\left(x\right)=1-x_{2}+\max\left\{\sin\left(\max\left\{x_{1},3\pi/2\right\}\right),x_{1}-2\pi\right\}$ . Finally, we can check that

[TABLE]

and $\operatorname{epi}\left(\gamma_{C^{j,0}}\right)=\{\left(x,y\right)\in\mathbb{R}^{3}\,:\,(\operatorname{cl}\tilde{f}_{j})(x,y)\leq 0,\quad\left(-1\right)^{j}x_{1}\leq\left(-1\right)^{j}j\cdot\pi\cdot y,\quad 0\leq x_{2}\leq y\}$ . We can then use Corollary 1 to obtain the ideal formulation of $x\in C^{1}\cup C^{2}$ given by

[TABLE]

5.4 Generalization of Theorem 2.4

Lemma 4 and Proposition 5 can also be used to generalize Theorem 2.4 by combining them with Proposition 3.

Theorem 5.2

Let $\left\{v^{j}\right\}_{j=1}^{n}\subseteq\mathbb{R}^{n}$ be an orthonormal basis of $\mathbb{R}^{n}$ and for each $s\in\left\{-1,0,1\right\}^{n}$ let $K^{s}=\operatorname{cone}\left(\left\{s_{j}v^{j}\right\}_{j=1}^{n}\right)$ . In addition, let $\left\{G^{i}\right\}_{i=1}^{k}$ be closed convex sets in $\mathbb{R}^{n}$ such that ${\bf 0}\in G^{i}$ for all $i\in\left\llbracket k\right\rrbracket$ , $\left\{s^{i}\right\}_{i=1}^{k}\subseteq\left\{-1,1\right\}^{n}$ and $D^{i}=G^{i}\cap K^{s^{i}}$ for each $i\in\left\llbracket k\right\rrbracket$ be such that

1. $\left(D^{i}-K^{s^{i}}\right)\cap K^{s^{i}}=D^{i}$ and $D^{i}$ is compact for all $i\in\left\llbracket k\right\rrbracket$ , and

2. for all $t\in\left\{-1,1\right\}^{n}$ there exist disjoint sets $\left\{J^{t}_{i}\right\}_{i=1}^{k}$ such that $J^{t}_{i}\subseteq\left\llbracket n\right\rrbracket$ for all $i\in\left\llbracket k\right\rrbracket$ and $D^{i}+K^{t}=D^{i}\cap\operatorname{span}\left(\left\{v^{j}\right\}_{j\in J^{t}_{i}}\right)+K^{t}\quad\forall i\in\left\llbracket k\right\rrbracket$ (with the convention $\operatorname{span}\left(\emptyset\right)=\left\{\bf 0\right\}$ ).

Finally, let $\left\{b^{i}\right\}_{i=1}^{k}\subseteq\mathbb{R}^{n}$ and for all $i,l\in\left\llbracket k\right\rrbracket$ and $j\in\left\llbracket n\right\rrbracket$ let $C^{i}:=D^{i}+b^{i}$ , $v^{i,j}:=s^{i}_{j}v^{j}$ , $\overline{b}^{i,l}_{j}=\max\left\{v^{i,j}\cdot x\,:\,x\in C^{l}\right\}$ for $i\neq l$ , $\overline{b}^{i,i}_{j}=v^{i,j}\cdot b^{i}$ , $L^{i}_{j}:=\min\left\{v^{j}\cdot x\,:\,x\in C^{i}\right\}$ and $U^{i}_{j}:=\max\left\{v^{j}\cdot x\,:\,x\in C^{i}\right\}$ . Then an ideal formulation for $x\in\bigcup_{i=1}^{k}C^{i}$ is given by

[TABLE]

In particular, if $G^{i}=\left\{x\in H^{i}\,:\,v^{i,j}\cdot\left(x+b^{i}\right)\leq\overline{b}^{i,i}_{j},\quad\forall j\in\left\llbracket n\right\rrbracket\right\}$ for a closed convex set $H^{i}\subseteq\mathbb{R}^{n}$ , then we can replace $\gamma_{G^{i}}$ by $\gamma_{H^{i}}$ in (21a).

Proof

For each $t\in\left\{-1,1\right\}^{n}$ and $i\in\left\llbracket k\right\rrbracket$ let $C^{t,i}:=C^{i}+K^{t}=D^{i}+K^{t}+b^{i}$ . We trivially have $C^{i}\subseteq C^{t,i}$ for all $t\in\left\{-1,1\right\}^{n}$ and $i\in\left\llbracket k\right\rrbracket$ . Furthermore, for all $t\in\left\{-1,1\right\}^{n}$ and $u\in-K^{t}$ we have $\sigma_{C^{t,i}}\left(u\right)=\sigma_{C^{i}}\left(u\right)$ for all $i\in\left\llbracket k\right\rrbracket$ by Lemma 4, because $\left(K^{t}\right)^{*}=-K^{t}$ and $C^{i}$ is compact. Then, because $\mathbb{R}^{n}=\bigcup_{t\in\left\{-1,1\right\}^{n}}K^{t}$ , Proposition 5 yields $Q\left(\mathcal{C}\right)=\bigcap_{t\in\left\{-1,1\right\}^{n}}Q\left(\mathcal{C}^{t}\right)$ for $\mathcal{C}^{t}:=\left\{C^{t,i}\right\}_{i=1}^{k}$ . Noting that $C^{t,i}=C^{i}+K^{t}=D^{i}+K^{t}+b^{i}=D^{i}\cap\operatorname{span}\left(\left\{v^{j}\right\}_{j\in J^{t}_{i}}\right)+K^{t}+b^{i}$ and using $D^{i}=G^{i}\cap K^{s^{i}}$ we can use Proposition 3 to describe $Q\left(\mathcal{C}^{t}\right)$ . Noting that $u^{i,j}=v^{i,j}$ if $s_{j}^{i}=-t_{j}$ and $u^{i,j}=0$ otherwise, we have that this description is equal to

[TABLE]

where for all $i,l\in\left\llbracket k\right\rrbracket$ and $j\in\left\llbracket n\right\rrbracket$ , $\hat{J}_{i}^{t}=\left\{j\in J_{i}^{t}\,:\,s_{j}^{i}=-t_{j}\right\}$ and

[TABLE]

where the first and last equalities follow from $t_{j}v^{j}$ being a ray of $K^{t}$ and the second follows from the theorem’s assumptions. Because $C^{i}=b^{i}+D^{i}$ we have that (22b) for all $t\in\left\{-1,1\right\}^{n}$ is equivalent to (21b). To show that (22a) for all $t\in\left\{-1,1\right\}^{n}$ is equivalent to (21a) it suffices to note that if $\mu\in\mathbb{R}^{n}_{+}$ and $\lambda\in\mathbb{R}^{n}_{+}$ are such that $\mu_{j}\leq\lambda_{j}$ for all $j\in\left\llbracket n\right\rrbracket$ then $\gamma_{G^{i}}\left(\sum_{j=1}^{n}\mu_{j}v^{i,j}\right)\leq\gamma_{G^{i}}\left(\sum_{j=1}^{n}\lambda_{j}v^{i,j}\right)$ . For that assume for a contradiction that the strict reverse inequality holds for some $\mu$ and $\lambda$ . Then we can scale $\mu$ and $\lambda$ so that $\sum_{j=1}^{n}\mu_{j}v^{i,j}\notin G^{i}$ and $\sum_{j=1}^{n}\lambda_{j}v^{i,j}\in G^{i}$ . However, $\sum_{j=1}^{n}\lambda_{j}v^{i,j},\;\sum_{j=1}^{n}\mu_{j}v^{i,j}\in K^{s^{i}}$ so $\sum_{j=1}^{n}\mu_{j}v^{i,j}=\sum_{j=1}^{n}\lambda_{j}v^{i,j}-\sum_{j=1}^{n}(\lambda_{j}-\mu_{j})v^{i,j}\in\left(\left(D^{i}-K^{s^{i}}\right)\cap K^{s^{i}}\right)$ , which contradicts the theorem’s assumptions. The final statement follows by noting that $\operatorname{epi}\left(\gamma_{G^{i}}\right)=\left\{\left(x,y\right)\in\operatorname{epi}\left(\gamma_{H^{i}}\right)\,:\,v^{i,j}\cdot\left(x+b^{i}y\right)\leq\overline{b}^{i,i}_{j}y,\quad\forall j\in\left\llbracket n\right\rrbracket\right\}$ .∎

Theorem 5.2 generalizes Theorem 2.4 in two ways. First by allowing unions of more than two sets. Second by relaxing the monotonicity requirement on the sets from a condition of the form $G^{i}_{\infty}=\mathbb{R}^{n}_{-}$ to one of the form $\left(G^{i}\cap\mathbb{R}^{n}_{+}-\mathbb{R}^{n}_{+}\right)\cap\mathbb{R}^{n}_{+}=G^{i}\cap\mathbb{R}^{n}_{+}$ . An example of a set that satisfies the latter condition, but not the former is the Euclidean ball. Theorem 5.2 achieves this by using a representation of the Minkowski sum based on the operation $\left(\cdot\right)^{+}$ (cf. Lemma 5), which can have some practical implications that we explore next.
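The relaxed monotonicity condition $\left(G\cap\mathbb{R}^{n}_{+}-\mathbb{R}^{n}_{+}\right)\cap\mathbb{R}^{n}_{+}=G\cap\mathbb{R}^{n}_{+}$ says that moving from a point of $G\cap\mathbb{R}^{n}_{+}$ towards the coordinate hyperplanes, while staying in the orthant, never leaves $G$ . The following Python sketch (hypothetical names; random sampling, not a proof) searches for violations of the non-trivial inclusion for the Euclidean unit ball and contrasts it with a hand-picked ball that contains ${\bf 0}$ but is not centered at it, for which the inclusion fails.

```python
import math
import random

def in_unit_ball(x):
    """Euclidean unit ball centered at the origin."""
    return math.sqrt(sum(xj * xj for xj in x)) <= 1.0

def in_shifted_ball(x, center=(1.0, 1.0), radius=1.5):
    """A hypothetical ball in R^2 that contains 0 but is not centered at it."""
    return math.sqrt(sum((xj - cj) ** 2 for xj, cj in zip(x, center))) <= radius

def violates(in_G, x, d):
    """True if x is in G and R^n_+, d >= 0, z = x - d >= 0, but z is not in G."""
    z = [xj - dj for xj, dj in zip(x, d)]
    nonneg = all(v >= 0 for v in list(x) + list(d) + z)
    return nonneg and in_G(x) and not in_G(z)

def count_violations(in_G, n=3, trials=5000, seed=0):
    """Number of random pairs (x, d) in [0, 1]^n that violate the inclusion."""
    rng = random.Random(seed)
    return sum(
        violates(in_G,
                 [rng.uniform(0, 1) for _ in range(n)],
                 [rng.uniform(0, 1) for _ in range(n)])
        for _ in range(trials))

print(count_violations(in_unit_ball))
print(violates(in_shifted_ball, (1.0, 2.4), (1.0, 0.0)))
```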

5.5 Minkowski sum, formulation size and constraint representation

As noted in Hijazi ; Hijazioptonline14 , ensuring formulation (7) from Theorem 2.4 is ideal may require adding all exponentially many (in $n$ ) inequalities (7a) for each $i\in\left\llbracket 2\right\rrbracket$ . However, formulation (21) from Theorem 5.2 only requires one nonlinear inequality (21a) for each $i\in\left\llbracket k\right\rrbracket$ to be ideal. We now study this seeming paradox starting with an example that shows how and when an exponential number of inequalities (7a) are needed.

Example 8

Let $G^{1}:=\left\{x\in\mathbb{R}^{n}\,:\,\prod_{j=1}^{n}(2-x_{j})\geq 1,\;x_{j}\leq 2\;\forall j\in\left\llbracket n\right\rrbracket\right\}$ , $G^{2}=-2\cdot{\bf 1}+\mathbb{R}^{n}_{+}$ , $r=2-2^{-1/n}<2-2^{1/(1-n)}$ , $C^{1}=G^{1}\cap[0,r]^{n}$ and $C^{2}=G^{2}\cap[-2,0]^{n}$ . By Theorem 2.4 an ideal formulation for $x\in C^{1}\cup C^{2}$ is given by

[TABLE]

where we omitted $\gamma_{G^{2}}\left(\left[x\right]_{J}\right)\leq y_{2}$ as they are redundant because $\operatorname{epi}\left(\gamma_{G^{2}}\right)=\left\{\left(x,y\right)\,:\,x_{i}\geq-2y\;\forall i\in\left\llbracket n\right\rrbracket\right\}$ . Alternatively, if for any $a\in\mathbb{R}^{n}$ we let $\left[a\right]^{+}\in\mathbb{R}^{n}$ be such that ${\left[a\right]^{+}}_{j}=\left(a_{j}\right)^{+}$ , then by Theorem 5.2 an ideal formulation is given by

[TABLE]

where we again removed a redundant inequality associated to $\gamma_{G^{2}}$ . Finally,

[TABLE]

Now, by the selection of $r$ we have that for any $J\subseteq\left\llbracket n\right\rrbracket$ such that $\left|J\right|\leq n-1$ , having $-2y_{2}\leq x_{j}\leq ry_{1}$ for all $j\in\left\llbracket n\right\rrbracket$ implies

[TABLE]

Hence replacing (23a) or (24a) by $\gamma_{G^{1}}\left(x\right)\leq y_{1}$ also yields an ideal formulation.
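The role of the choice of $r$ can be checked numerically. The sketch below assumes, directly from the definition of $G^{1}$ , that for $y_{1}>0$ the inequality $\gamma_{G^{1}}\left(\left[x\right]_{J}\right)\leq y_{1}$ amounts to $(2y_{1})^{n-\left|J\right|}\prod_{j\in J}(2y_{1}-x_{j})\geq y_{1}^{n}$ , with $\left[x\right]_{J}$ read as $x$ with the components outside $J$ set to zero. Under $x_{j}\leq ry_{1}$ the left-hand side is smallest at $x_{j}=ry_{1}$ for $j\in J$ , so the implication reduces to $(2-r)^{\left|J\right|}2^{n-\left|J\right|}\geq 1$ . The function names are illustrative only.

```python
def worst_case_factor(n, m, r):
    """Smallest value of 2**(n-m) * prod_{j in J}(2 - x_j/y_1) when |J| = m
    and x_j <= r*y_1; it is attained at x_j = r*y_1 for every j in J."""
    return (2.0 - r) ** m * 2.0 ** (n - m)


def implied_by_bounds(n):
    """For r = 2 - 2**(-1/n), report for each size m = |J| whether the bounds
    alone force gamma_{G^1}([x]_J) <= y_1 (under the reading stated above)."""
    r = 2.0 - 2.0 ** (-1.0 / n)
    return [worst_case_factor(n, m, r) >= 1.0 for m in range(n + 1)]


for n in range(2, 7):
    # Expected pattern: True for every m <= n-1, False only for m = n.
    print(n, implied_by_bounds(n))
```

Under these assumptions only the inequality for $J=\left\llbracket n\right\rrbracket$ is not implied by the variable bounds, which is consistent with keeping the single inequality $\gamma_{G^{1}}\left(x\right)\leq y_{1}$ .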

In contrast, if we instead let $r=2$ , the replacement of (23a) or (24a) results in a valid, but not ideal formulation. Indeed, for any $J\subseteq\left\llbracket n\right\rrbracket$ such that $1\leq\left|J\right|\leq n-1$ , let $\left(x,y\right)\in\mathbb{R}^{n+2}$ be given by $y_{1}=y_{2}=1/2$ , $x_{j}=1-2^{-n/\left|J\right|}(3/2)^{\left(\left|J\right|-n\right)/\left|J\right|}$ for $j\in J$ and $x_{j}=-1/2$ for $j\notin J$ . Then $\left(x,y\right)$ is feasible for the continuous relaxation of (23b)/(24b) and $\gamma_{G^{1}}\left(x\right)\leq y_{1}$ , but violates $\gamma_{G^{1}}\left(\left[x\right]_{J}\right)\leq y_{1}$ and $\gamma_{G^{1}}(\left[x\right]^{+})\leq y_{1}$ . Hence, in this case formulation (23) from Theorem 2.4 requires an exponential number of inequalities, while formulation (24) from Theorem 5.2 only requires a linear number of inequalities. However, the non-polyhedral nature of the inequalities makes such accounting a subtle matter. For instance, (23a) is equivalent to the single inequality $\max_{J\subseteq\left\llbracket n\right\rrbracket}\gamma_{G^{1}}\left(\left[x\right]_{J}\right)\leq y_{1}$ and in fact $\max_{J\subseteq\left\llbracket n\right\rrbracket}\gamma_{G^{1}}\left(\left[x\right]_{J}\right)=\gamma_{G^{1}}(\left[x\right]^{+})$ . Thus (23a) and (24a) are different representations of the same convex constraint. Further insight into this can be gained by noting that (24a) (i.e. $\gamma_{G^{1}}(\left[x\right]^{+})\leq y_{1}$ ) is equivalent to

[TABLE]

Hence, (24a) can be thought of as the implicit description of linear sized extended formulation (26) of the exponential number of inequalities (23a). ∎
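The point used in the case $r=2$ can be verified numerically. The sketch below is only an illustration: it assumes that for $y>0$ the condition $\gamma_{G^{1}}\left(x\right)\leq y$ is equivalent to $x\in yG^{1}$ , i.e. $\prod_{j=1}^{n}(2y-x_{j})\geq y^{n}$ with $x_{j}\leq 2y$ , and that $\left[x\right]_{J}$ keeps the components in $J$ and sets the others to zero. All function names are hypothetical.

```python
from math import prod


def in_scaled_G1(x, y, tol=1e-9):
    """Membership of x in y*G^1 for y > 0, i.e. gamma_{G^1}(x) <= y."""
    slack = [2.0 * y - xj for xj in x]
    return all(s >= -tol for s in slack) and prod(slack) >= y ** len(x) - tol


def restrict(x, J):
    """[x]_J: keep the components indexed by J, zero out the others (assumed reading)."""
    return [xj if j in J else 0.0 for j, xj in enumerate(x)]


def positive_part(x):
    """[x]^+: componentwise positive part."""
    return [max(xj, 0.0) for xj in x]


def example_point(n, J):
    """The point x of Example 8 for r = 2 and y_1 = y_2 = 1/2 (0-based indices in J)."""
    m = len(J)
    value = 1.0 - 2.0 ** (-n / m) * 1.5 ** ((m - n) / m)
    return [value if j in J else -0.5 for j in range(n)]


n, J, y1, y2, r = 3, {0, 1}, 0.5, 0.5, 2.0
x = example_point(n, J)
print(all(-2.0 * y2 <= xj <= r * y1 for xj in x))   # variable bounds hold
print(in_scaled_G1(x, y1))                          # gamma(x) <= y_1 holds
print(in_scaled_G1(restrict(x, J), y1))             # gamma([x]_J) <= y_1 fails
print(in_scaled_G1(positive_part(x), y1))           # gamma([x]^+) <= y_1 fails
```

For this point $\left[x\right]_{J}$ and $\left[x\right]^{+}$ coincide, since the components in $J$ are positive and the remaining ones are negative.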

A detailed study of the size evaluation challenges illustrated in Example 8 is beyond the scope of this paper, but we make two observations about it.

The first concerns explicit formulation representations that can be fed to a MIP solver. Formulation (24a) can be explicitly represented using operation $(\cdot)^{+}$ or through extended formulation (26). The former could cause numerical issues due to the non-differentiability of $(\cdot)^{+}$ , while the auxiliary variables $z$ of the latter could have a similar detrimental effect as variable copies $x^{i}$ of formulation (2) from Theorem 2.1. In addition, we can use representation (25) of $\gamma_{G^{1}}$ or use standard second order cone (SOC) representations of the geometric mean that use additional auxiliary variables (e.g. ben2001lectures ). In contrast to variable copies $x^{i}$ , the auxiliary variables of such SOC representations have been shown to have a significant positive performance effect DBLP:conf/ipco/LubinYBV16 ; lubin2016polyhedral . Hence, these implementation alternatives must be carefully compared to ensure the potential performance gain of the significantly smaller formulation (24a) over (23b) (or even formulations based on Theorem 2.1) is achieved in practice. Similarly, the computational advantage of formulating trigonometric sets directly (as in Example 7) instead of a quadratic approximation (as in Example 6) is uncertain because of the high quality of the approximation from bestuzheva2016convex ; hijazi2013convex .

The second observation concerns the existence of linear-sized formulations that do not use operation $(\cdot)^{+}$ or additional continuous auxiliary variables $z$ . As noted in Example 8 this question is meaningless unless we give precise restrictions on the class of nonlinear inequalities we allow. Restricting to polynomial inequalities is not enough to achieve this goal, but the following example shows that it can still lead to interesting results and insights.

Example 9

The sets considered in Examples 1–3, in Example 5 and in Example 6 can be described by a finite number of polynomial inequalities. Such sets are usually denoted basic semi-algebraic and unions of such sets are usually denoted semi-algebraic sets. It is known that the convex hull of the union of basic semi-algebraic sets is semi-algebraic, but not necessarily basic semi-algebraic. Hence, if $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}$ is a finite family of basic semi-algebraic sets, then $Q\left(\mathcal{C}\right)$ may or may not be basic semi-algebraic as it is the convex hull of particularly structured sets. The continuous relaxations of (16) and (19) show that $Q\left(\mathcal{C}\right)$ is basic semi-algebraic for the sets in Examples 1–3 and in Example 5. However, we now show that it is not basic semi-algebraic for the sets in Example 6. For that take the affine section of the continuous relaxation of (20) obtained by fixing $y_{1}=y_{2}=1/2$ and which is given by

[TABLE]

This set is depicted in Figure 5 in gray where we can confirm that it is semi-algebraic (it is the convex hull of portions of two parabolas). However, we can check that the Zariski closure of its boundary (smallest algebraic variety that contains this boundary) is given by

[TABLE]

and depicted in black in Figure 5. We can also check that $Z\cap\operatorname{int}\left(M\right)\neq\emptyset$ , which is a known impediment for a set to be basic semi-algebraic andradas1994ubiquity ; blekherman2013semidefinite . ∎

Note that for the sets in Examples 1–3 and in Example 5 the description of the Minkowski sum from Lemma 4 does not require the operation $(\cdot)^{+}$ and $Q\left(\mathcal{C}\right)$ is basic semi-algebraic. In contrast, the operation is required for Example 6 and $Q\left(\mathcal{C}\right)$ is not basic semi-algebraic. This shows that operation $(\cdot)^{+}$ can affect the properties of $Q\left(\mathcal{C}\right)$ and that this is strongly tied to the Minkowski sum operation. In fact, using Proposition 6 below, Example 9 yields $C^{1}:=[-1,1]\times\left\{0\right\}$ and $C^{2}:=\left\{x\in[-1,1]\times[0,1]\,:\,x_{1}^{2}\leq x_{2}\right\}$ as examples of basic semi-algebraic sets whose Minkowski sum is not basic semi-algebraic.
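How $(\cdot)^{+}$ enters the Minkowski sum of these two sets can be illustrated with a small numerical sketch. It compares a brute-force membership test for $C^{1}+C^{2}$ with the candidate description $\left\{x\,:\,0\leq x_{2}\leq 1,\;\left(\left(\left|x_{1}\right|-1\right)^{+}\right)^{2}\leq x_{2}\right\}$ . This closed form is an elementary derivation made for this illustration (shift $x_{1}$ by the nearest point of $[-1,1]$ ), not a formula taken from the text, and the function names are hypothetical.

```python
def in_C2(x1, x2, tol=1e-9):
    """Membership in C^2 = {x in [-1,1] x [0,1] : x1^2 <= x2}."""
    return (-1.0 - tol <= x1 <= 1.0 + tol
            and -tol <= x2 <= 1.0 + tol
            and x1 * x1 <= x2 + tol)


def in_sum_bruteforce(x1, x2, steps=200):
    """x in C^1 + C^2 iff (x1 - t, x2) in C^2 for some t in [-1,1].
    The search over t uses a uniform grid, so it is only an inner approximation."""
    return any(in_C2(x1 - (-1.0 + 2.0 * i / steps), x2) for i in range(steps + 1))


def in_sum_closed_form(x1, x2, tol=1e-9):
    """Candidate description of C^1 + C^2 using the positive part (|x1| - 1)^+."""
    excess = max(abs(x1) - 1.0, 0.0)
    return -tol <= x2 <= 1.0 + tol and excess * excess <= x2 + tol


for point in [(1.5, 0.25), (1.5, 0.2), (0.3, 0.0), (0.0, 1.5), (-2.0, 1.0)]:
    print(point, in_sum_bruteforce(*point), in_sum_closed_form(*point))
```

On these sample points the two tests agree; the positive part is what lets a single inequality describe both the flat portion of the boundary over $\left|x_{1}\right|\leq 1$ and the two parabolic portions.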

6 Necessary and Sufficient Conditions for Piecewise Formulations

Example 5 shows how condition (13b) of Proposition 5 may not be necessary to obtain an ideal formulation. We now give necessary and sufficient strength conditions through a variant of (13a) that guarantees formulation validity.

Definition 5

Let $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ and $\mathcal{C}^{j}:=\left\{C^{j,i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ for $j\in\left\llbracket m\right\rrbracket$ be such that $C^{i}=\bigcap_{j=1}^{m}C^{j,i}$ for all $i\in\left\llbracket k\right\rrbracket$ so that a valid formulation of $x\in\bigcup_{i=1}^{k}C^{i}$ is given by

[TABLE]

We say (27) is ideal if its continuous relaxation is equal to $Q\left(\mathcal{C}\right)$ and sharp if the projection of this relaxation onto the $x$ variables is equal to $\operatorname{conv}\left(\bigcup_{i=1}^{k}C^{i}\right)$ .

Being sharp is a weaker strength requirement than being ideal (e.g. by Proposition 6 below, if (27) is ideal, then it is sharp), but can still result in good computational performance (e.g. (Mixed-Integer-Linear-Programming-Formulation-Techniques, Section 2.2)). In fact, the polyhedral work of Balas, Blair and Jeroslow balas88 ; blair90 ; jeroslow88 considered in Section 5.1 focused on constructing sharp formulations and resulted in necessary conditions that can be stated in the context of Definition 5 as follows.

Theorem 6.1 (Theorem 3 in blair90 )

Let $A\in\mathbb{R}^{m\times n}$ and for each $i\in\left\llbracket k\right\rrbracket$ let $b^{i}\in\mathbb{R}^{m}$ and $P^{i}=\left\{x\in\mathbb{R}^{n}\,:\,Ax\leq b^{i}\right\}$ , for each $B\in\mathcal{B}$ let $\mathcal{C}^{B}:=\left\{C^{B,i}\right\}_{i=1}^{k}$ be such that $C^{B,i}=P\left(B,b^{i}\right)$ for all $i\in\left\llbracket k\right\rrbracket$ and $\left\{\mathcal{C}^{j}\right\}_{j=1}^{m}=\left\{\mathcal{C}^{B}\right\}_{B\in\mathcal{B}}$ . If (27) is not sharp then there exists $B\in\mathcal{B}$ , $i_{1},i_{2}\in\left\llbracket k\right\rrbracket$ and $u\in\mathbb{R}^{n}$ such that

[TABLE]

To extend Theorem 6.1 we use the following generalization to non-polyhedral sets of a known relation between the Cayley embedding and the Minkowski sum of polytopes caytrick ; karavelas2013maximum ; WeibelPhd . We present a proof in Section 8.3.

Proposition 6

Let $\Delta^{k}:=\{\lambda\in\mathbb{R}^{k}_{+}\,:\,\sum_{i=1}^{k}\lambda_{i}=1\}$ , $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ , $Q\subseteq\mathbb{R}^{n+k}$ be a closed convex set such that $Q\subseteq\mathbb{R}^{n}\times\Delta^{k}$ and

[TABLE]

and $Q\left(\bar{y}\right):=\left\{x\in\mathbb{R}^{n}:\left(x,\bar{y}\right)\in Q\right\}$ . Then $Q\left(\mathcal{C}\right)\subseteq Q$ , $C_{\infty}=Q\left(\mathcal{C}\right)_{\infty}$ ,

[TABLE]

and the following are equivalent

1. $Q=Q\left(\mathcal{C}\right)$ .

2. $\forall\bar{y}\in\Delta^{k}\quad\quad\left(\bar{x},\bar{y}\right)\in Q\quad\Rightarrow\quad\bar{x}\in\sum_{i=1}^{k}\bar{y}_{i}C^{i}$ .

3. $\left(\bar{x},\frac{1}{k}\mathbf{1}\right)\in Q\quad\Rightarrow\quad\bar{x}\in\frac{1}{k}\sum_{i=1}^{k}C^{i}$ .

Theorem 6.2

For the $\left\{\mathcal{C}^{j}\right\}_{j=1}^{m}$ from Definition 5 and for any $\lambda\in\Delta^{k}$ let $Q^{m}\left(\lambda\right):=\bigcap_{j=1}^{m}\sum_{i=1}^{k}\lambda_{i}C^{j,i}$ . Formulation (27) is sharp if and only if

[TABLE]

Formulation (27) is ideal if and only if

[TABLE]

or equivalently if and only if

[TABLE]

Finally, the equivalences can be written as a function of $\sigma_{C^{j,i}}$ by noting that

[TABLE]

In particular, formulation (27) is ideal if and only if for all $u\in\mathbb{R}^{n}$

[TABLE]

Proof

By Theorem C.3.3.2 in hiriart-lemarechal-2001 we obtain the characterization of $\sigma_{Q^{m}\left(\lambda\right)}$ and that for all $u\in\mathbb{R}^{n}$ we have that $\sigma_{\operatorname{conv}\left(\bigcup_{i=1}^{k}C^{i}\right)}\left(u\right)=\max_{i=1}^{k}\sigma_{C^{i}}\left(u\right)$ , $\sigma_{\sum_{i=1}^{k}\lambda_{i}C^{i}}\left(u\right)=\sum_{i=1}^{k}\lambda_{i}\sigma_{C^{i}}\left(u\right)$ and $\sigma_{\bigcup_{\lambda\in\Delta^{k}}Q^{m}\left(\lambda\right)}\left(u\right)=\max_{\lambda\in\Delta^{k}}\sigma_{Q^{m}\left(\lambda\right)}\left(u\right)$ .

The result for being sharp follows from Proposition 6 implying $\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)=\bigcup_{\lambda\in\Delta^{k}}Q^{m}\left(\lambda\right)\times\left\{\lambda\right\}$ and hence that its projection onto the $x$ variables is $\bigcup_{\lambda\in\Delta^{k}}Q^{m}\left(\lambda\right)$ . The results for being ideal follow from Proposition 6 implying that $\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)=Q\left(\mathcal{C}\right)$ if and only if $Q^{m}\left(\lambda\right)=\sum_{i=1}^{k}\lambda_{i}C^{i}$ for all $\lambda\in\Delta^{k}$ or equivalently if $Q^{m}\left(\left(1/k\right){\mathbf{1}}\right)=(1/k)\sum_{i=1}^{k}C^{i}$ .∎

The conditions for formulation (27) being ideal and sharp from Theorem 6.2 can be contrasted by noting that for all $u\in\mathbb{R}^{n}$

[TABLE]

Hence, being sharp requires matching the maximum weighted average of the support functions while being ideal requires matching all weighted averages or equivalently the equal weight average or simply the sum.
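The two support function identities behind this contrast can be illustrated on polytopes given by their vertices. The sketch below is a minimal example under that assumption (the sets in the text are general closed convex sets); it computes $\sigma_{\sum_{i}\lambda_{i}C^{i}}$ by enumerating sums of vertices and compares it with the weighted average $\sum_{i}\lambda_{i}\sigma_{C^{i}}$ , and similarly compares $\sigma_{\operatorname{conv}\left(\bigcup_{i}C^{i}\right)}$ with $\max_{i}\sigma_{C^{i}}$ . The helper names are hypothetical.

```python
from itertools import product


def dot(u, x):
    return sum(ui * xi for ui, xi in zip(u, x))


def support(points, u):
    """Support function of the convex hull of a finite point set."""
    return max(dot(u, p) for p in points)


def weighted_minkowski_points(vertex_sets, weights):
    """Points sum_i w_i * p_i over all choices of one vertex p_i per set;
    their convex hull is the weighted Minkowski sum of the polytopes."""
    dim = len(vertex_sets[0][0])
    return [tuple(sum(w * p[d] for w, p in zip(weights, choice)) for d in range(dim))
            for choice in product(*vertex_sets)]


square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
triangle = [(-1.0, 0.0), (0.0, -1.0), (-2.0, -2.0)]
sets, lam, u = [square, triangle], (0.25, 0.75), (1.0, 2.0)

sum_direct = support(weighted_minkowski_points(sets, lam), u)
sum_formula = sum(l * support(P, u) for l, P in zip(lam, sets))
hull_direct = support(square + triangle, u)
hull_formula = max(support(P, u) for P in sets)
print(sum_direct, sum_formula)    # weighted average of the support functions
print(hull_direct, hull_formula)  # maximum of the support functions
```

Taking $\lambda$ at a vertex of $\Delta^{k}$ recovers a single $\sigma_{C^{i}}$ , which is why the maximum over $\lambda\in\Delta^{k}$ of the weighted average equals $\max_{i}\sigma_{C^{i}}$ .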

The necessary and sufficient condition (30) for being ideal of Theorem 6.2 can in turn be contrasted with condition (13b) of Proposition 5 which requires

[TABLE]

For instance, condition (30) can be simplified to replace condition (13b) with the slightly weaker condition

[TABLE]

We can check that sets $\{\mathcal{C}^{j}\}_{j=1}^{2}$ in the first part of Example 5 satisfy condition (31), but only if we add recession cone $C^{j,0}_{\infty}$ following Lemma 4 (cf. the left side of Figure 3 where the dotted curve describes $C^{1,1}$ if we do not add the cone). Similarly to the comments after Example 3, one way to interpret the need to satisfy condition (31) for Example 5 is to ensure that there is a non-zero intersection of the normals to $\{C^{1,i}\}_{i=1}^{2}$ at the portions of the boundary highlighted in white in Figure 3. The following corollary formalizes this idea into a sufficient condition that can be useful to verify that formulation (27) is ideal and/or to guide the construction of $\{\mathcal{C}^{j}\}_{j=1}^{m}$ to obtain an ideal formulation.

Corollary 3

Let the $\left\{\mathcal{C}^{j}\right\}_{j=1}^{m}$ from Definition 5 be such that $\operatorname{aff}\left(Q\left(\mathcal{C}\right)\right)\subseteq\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)$ , and for $\mathcal{D}:=\left\{D^{i}\right\}_{i=1}^{k}$ in $\mathbb{R}^{n}$ let $L\left(\mathcal{D}\right):=\sum_{i=1}^{k}L\left(D^{i}\right)$ and $N\left(\mathcal{D}\right):=\left\{\left({x}^{i}\right)_{i=1}^{k}\in\operatorname*{\scalerel*{\times}{\sum}}\nolimits_{i=1}^{k}\operatorname{bd}\left(D^{i}\right)\,:\,L\left(\mathcal{D}\right)\cap\bigcap\nolimits_{i=1}^{k}N_{D^{i}}\left({x}^{i}\right)\neq\left\{\bf 0\right\}\right\}$ . Then formulation (27) is ideal if and only if

[TABLE]

Proof

We have that $Q\left(\mathcal{C}\right)=\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)$ if and only if their affine hulls and relative boundaries match. Under the assumptions we have $Q\left(\mathcal{C}\right)\subseteq\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)$ , $\operatorname{aff}\left(Q\left(\mathcal{C}\right)\right)\subseteq\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)$ and $y_{i}\geq 0$ for all $i\in\left\llbracket k\right\rrbracket$ and $\left(x,y\right)\in\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)$ . Hence, $Q\left(\mathcal{C}\right)=\bigcap_{j=1}^{m}Q\left(\mathcal{C}^{j}\right)$ if and only if portion (12) of the boundary characterization of $Q\left(\mathcal{C}\right)$ from Proposition 4 is equal to the union of the same portions for the $Q\left(\mathcal{C}^{j}\right)$ , which is equivalent to (32). ∎

We can check that sets $\{\mathcal{C}^{j}\}_{j=1}^{2}$ in the first part of Example 5 also satisfy condition (32) and redundant sets $\mathcal{C}^{3}$ are not needed to show formulation (19) is ideal. Now, in this case, the redundancy of $\mathcal{C}^{3}$ needed for Proposition 5 only resulted in easy to recognize duplicate inequalities in (19). However, the following example shows how using Corollary 3 instead of Proposition 5 can avoid more consequential redundancies.

Example 10

Consider again the sets from Example 8 given by $C^{1}=G^{1}\cap[0,r]^{n}$ for $G^{1}:=\left\{x\in\mathbb{R}^{n}\,:\,\prod_{j=1}^{n}(2-x_{j})\geq 1,\;x_{j}\leq 2\;\forall j\in\left\llbracket n\right\rrbracket\right\}$ , and $C^{2}=[-2,0]^{n}$ . The first version of these sets takes $r=2-2^{-1/n}$ and is depicted in Figure 6 for $n=3$ . The redundancy analysis in the example yielded the simplified version of the formulation from Theorems 2.4 and 5.2 for $x\in C^{1}\cup C^{2}$ given by

[TABLE]

An alternative way to get this formulation is by noting that the boundary of $C^{1}$ has a polyhedral portion associated to the variable bounds and a non-polyhedral portion associated to $G^{1}$ . This non-polyhedral portion is highlighted dark gray in Figure 6(a) for $n=3$ , and for all $n$ it can be sub-divided into $\operatorname{bd}\left(G^{1}\right)\cap\left(0,r\right)^{n}$ and $\operatorname{bd}\left(G^{1}\right)\cap\operatorname{bd}\left([0,r]^{n}\right)$ . If $x^{1}\in\operatorname{bd}\left(G^{1}\right)\cap\left(0,r\right)^{n}$ we have that $N_{C^{1}}\left(x^{1}\right)$ is contained in the strictly positive orthant. Hence for all $x^{1}\in\operatorname{bd}\left(G^{1}\right)\cap\left(0,r\right)^{n}$ we have $\left(x^{1},x^{2}\right)\in N\left(\mathcal{C}\right)$ if and only if $x^{2}={\bf 0}$ , which is also highlighted in dark gray in Figure 6(b). In contrast, because of the choice of $r$ we have that if $x^{1}\in\operatorname{bd}\left(G^{1}\right)\cap\operatorname{bd}\left([0,r]^{n}\right)$ then there exists $J\subseteq\left\llbracket n\right\rrbracket$ such that $x^{1}\in\left\{x\in\operatorname{bd}\left(G^{1}\right)\,:\,x_{j}=r\quad\forall j\in J\right\}$ and $\left(x^{1},x^{2}\right)\in N\left(\mathcal{C}\right)$ if and only if $x^{2}\in\bigcup_{j\in J}\left\{x\in[-2,0]^{n}\,:\,x_{j}=0\right\}$ . Then condition (32) is satisfied for $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ given by $C^{1,1}:=[0,r]^{n}$ , $C^{1,2}:=[-2,0]^{n}$ , $C^{2,1}=G^{1}$ , $C^{2,2}=\mathbb{R}^{n}_{-}$ . In particular, the key for satisfying the condition is that for all $\left(x^{1},x^{2}\right)\in N\left(\mathcal{C}\right)$ such that $x^{1}\in\operatorname{bd}\left(G^{1}\right)\cap\operatorname{bd}\left([0,r]^{n}\right)$ we have that $\left(x^{1},x^{2}\right)\in N\left(\mathcal{C}^{1}\right)$ . Finally, Corollary 3 with this decomposition yields precisely (33).∎

Our final example illustrates how Corollary 3 can be used to show Theorem 2.4 and give a geometric interpretation of the associated formulation.

Example 11

Consider now the second version of the sets from Example 8 which corresponds to the same sets in Example 10, but with $r=2$ . These sets are depicted in Figure 7 for $n=3$ . We can again use $\left\{\mathcal{C}^{j}\right\}_{j=1}^{2}$ given by $C^{1,1}:=[0,r]^{n}$ , $C^{1,2}:=[-2,0]^{n}$ , $C^{2,1}=G^{1}$ , $C^{2,2}=\mathbb{R}^{n}_{-}$ to get valid formulation (33). However, from Example 8 we know that for this choice of $r$ this formulation is no longer ideal. Indeed, condition (32) of Corollary 3 is no longer satisfied because we no longer have $\left(x^{1},x^{2}\right)\in N\left(\mathcal{C}^{1}\right)$ for all $\left(x^{1},x^{2}\right)\in N\left(\mathcal{C}\right)$ such that $x^{1}\in\operatorname{bd}\left(G^{1}\right)\cap\operatorname{bd}\left([0,r]^{n}\right)$ . For instance, if $x^{1}\in D^{1}:=\operatorname{bd}\left(G^{1}\right)\cap\left((0,r)^{n-1}\times\left\{0\right\}\right)$ (highlighted in dark gray in Figure 7(a)) and $x^{2}\in D^{2}:=\operatorname{bd}\left(C^{2}\right)\cap\left(\left\{0\right\}^{n-1}\times[-r,0]\right)$ (highlighted in dark gray in Figure 7(b)) we have that $\left(x^{1},x^{2}\right)\in N\left(\mathcal{C}\right)$ , but $x^{1}\notin\operatorname{bd}\left(C^{1,1}\right)$ . This specific case can be resolved by adding $\mathcal{C}^{4}$ such that $C^{4,1}:=G^{1}+\operatorname{span}\left(\left\{{\bf e}^{n}\right\}\right)$ and $C^{4,2}:=G^{1}_{\infty}+\operatorname{span}\left(\left\{{\bf e}^{n}\right\}\right)=(-\infty,0]^{n-1}\times\left\{0\right\}$ ( $C^{4,1}$ is depicted in Figure 7(a) by the transparent meshed surface), as $\left(x^{1},x^{2}\right)\in N\left(\mathcal{C}^{4}\right)$ for all $\left(x^{1},x^{2}\right)\in D^{1}\times D^{2}$ . 
Similarly, we can resolve all additional cases and satisfy condition (32) by adding $\mathcal{C}^{J}$ such that $C^{J,1}:=G^{1}+\operatorname{span}\left(\left\{{\bf e}^{j}\right\}_{j\in J}\right)$ and $C^{J,2}:=G^{1}_{\infty}+\operatorname{span}\left(\left\{{\bf e}^{j}\right\}_{j\in J}\right)$ for all $J\subseteq\left\llbracket n\right\rrbracket$ with $\left|J\right|\leq n-1$ . By noting that $\gamma_{C^{J,1}}\left(x\right)=\gamma_{G^{1}}\left(\left[x\right]_{J}\right)$ , we have that the formulation obtained from Corollary 3 for $\left\{\mathcal{C}^{J}\right\}_{J\subseteq\left\llbracket n\right\rrbracket}$ is precisely formulation (23) obtained from Theorem 2.4.∎

7 Conclusions

Modeling disjunctive constraints with ideal MIP formulations that avoid the variable copies of standard convex hull formulations can provide a computational advantage over both convex hull and Big-M formulations lodi15 ; Hijazi . Unfortunately, existing techniques to construct such formulations are restricted to special structures or require ad-hoc decomposition techniques. In this paper, we introduced systematic and generic construction tools for these formulations. The tools do require understanding the geometry of the disjunctive constraints. However, when this understanding is available, the techniques are easily applicable even for high-dimensional constraints, constraints with a large number of terms, and highly non-polyhedral constraints (e.g. Example 5). The resultant formulations can usually be represented in a format compatible with MIP solvers through simple gauge-calculus (e.g. Lemma 2). However, these representations may include maximum operations that could lead to differentiability issues (e.g. Examples 6, 7 and 8). Such issues can be avoided through standard linear programming tricks, but these introduce continuous auxiliary variables that are similar to the variable copies we aimed to avoid. Nonetheless, these auxiliary variables may not necessarily have the same negative computational effect as the variable copies. Finally, as illustrated in Example 8, the max operation and the continuous auxiliary variables can sometimes be avoided with little or no loss of formulation strength, and, as illustrated in lodi15 ; Hijazi , even when some strength is lost, the formulations can still provide an advantage.

8 Omitted Proofs

8.1 Theorem 2.1 and Proposition 1

Proof (of Theorem 2.1)

Validity is direct from Lemma 1, $C^{i}_{\infty}=\left(C^{i}-a^{i}\right)_{\infty}$ and $\mathcal{C}\in\mathbb{C}_{n}$ . For idealness, let $Q$ be the continuous relaxation of (2) and assume for a contradiction that there exists a minimal face $F$ of $Q$ and $(x,y)\in F$ with $y\notin\left\{0,1\right\}^{k}$ . Without loss of generality $y_{1},y_{2}\in(0,1)$ . Let $\varepsilon=\min\{y_{1},y_{2},1-y_{1},1-y_{2}\}\in(0,1)$ , $\underline{y}_{1}=y_{1}+\varepsilon$ , $\underline{y}_{2}=y_{2}-\varepsilon$ , $\overline{y}_{1}=y_{1}-\varepsilon$ , $\overline{y}_{2}=y_{2}+\varepsilon$ , $\underline{y}_{i}=\overline{y}_{i}=y_{i}$ for all $i\notin\left\{1,2\right\}$ , $\underline{x}^{i}=(\underline{y}_{i}/y_{i})x^{i}$ and $\overline{x}^{i}=(\overline{y}_{i}/y_{i})x^{i}$ for $i\in\left\{1,2\right\}$ , $\underline{x}^{i}=\overline{x}^{i}=x^{i}$ for all $i\notin\left\{1,2\right\}$ , $\underline{x}=\sum_{i=1}^{k}\underline{x}^{i}$ and $\overline{x}=\sum_{i=1}^{k}\overline{x}^{i}$ . Then $(\underline{x},\underline{y})\neq(\overline{x},\overline{y})$ , $(x,y)=(1/2)(\underline{x},\underline{y})+(1/2)(\overline{x},\overline{y})$ . Multiplying $\gamma_{C^{i}-b^{i}}\left(x^{i}-b^{i}y_{i}\right)\leq y_{i}$ for $i\in\left\{1,2\right\}$ by $\underline{y}_{i}/y_{i}$ or $\overline{y}_{i}/y_{i}$ and using the positive homogeneity of $\gamma_{C^{i}-b^{i}}$ we have $(\underline{x},\underline{y}),(\overline{x},\overline{y})\in Q$ . Hence, $(\underline{x},\underline{y}),(\overline{x},\overline{y})\in F$ . Furthermore, by construction either $\underline{y}_{1}=1$ , $\underline{y}_{2}=0$ , $\overline{y}_{1}=0$ or $\overline{y}_{2}=1$ . If $\underline{y}_{1}=1$ , then $\left\{\left(x,y\right)\in F\,:\,y_{1}=1\right\}\subsetneq F$ is a face of the continuous relaxation of (2), which contradicts the minimality of $F$ . All other three cases are analogous. 
The final statement follows from the recession cone of the continuous relaxation of (2) being equal to all $\left(x,y\right)$ such that $x\in C^{1}_{\infty}$ , $y=0$ and $x^{i}\in C^{1}_{\infty}$ for all $i\in\left\llbracket k\right\rrbracket$ . ∎
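The perturbation used in the idealness argument can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: each $C^{i}$ is taken to be the Euclidean unit ball with $b^{i}={\bf 0}$ , so the gauge is the Euclidean norm, and the two perturbed points move $y_{1}$ and $y_{2}$ by $\varepsilon$ in opposite directions (the reading under which $(x,y)$ is their midpoint). The function names are hypothetical.

```python
from math import hypot


def split_fractional(x_copies, y):
    """Build the two perturbed points (copies, y) obtained by moving y_1, y_2
    by eps in opposite directions and rescaling the copies x^1, x^2 accordingly."""
    eps = min(y[0], y[1], 1.0 - y[0], 1.0 - y[1])
    y_under = [y[0] + eps, y[1] - eps] + list(y[2:])
    y_over = [y[0] - eps, y[1] + eps] + list(y[2:])

    def rescale(y_new):
        return [tuple(c * (y_new[i] / y[i]) for c in xi) if i < 2 else tuple(xi)
                for i, xi in enumerate(x_copies)]

    return (rescale(y_under), y_under), (rescale(y_over), y_over)


def feasible(x_copies, y, tol=1e-9):
    """Check gamma(x^i) <= y_i for the unit-ball gauge (Euclidean norm)."""
    return all(hypot(*xi) <= yi + tol for xi, yi in zip(x_copies, y))


y = [0.25, 0.5, 0.25]
x_copies = [(0.2, 0.0), (0.3, 0.4), (0.0, 0.1)]
(under_x, under_y), (over_x, over_y) = split_fractional(x_copies, y)
print(under_y, over_y)                                  # one entry reaches 0 or 1
print(feasible(under_x, under_y), feasible(over_x, over_y))
```

Positive homogeneity of the gauge is what keeps both perturbed points feasible after rescaling the copies.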

Proof (of Proposition 1)

Part 1 follows directly from Corollary 9.8.1 in rockafellar2015convex by noting that if $\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ then $\left\{C^{i}\times\left\{e^{i}\right\}\right\}_{i=1}^{k}\in\mathbb{C}_{n+k}$ . For part 2 note that we have that $\left(x,y\right)\in Q\left(\mathcal{C}\right)$ if and only if $y\in\mathbb{R}^{k}_{+}$ , $\sum_{i=1}^{k}y_{i}=1$ and

[TABLE]

The result follows directly if (34) is equivalent to

[TABLE]

To show this equivalence first note that $\tilde{x}^{i}\in C^{i}$ if and only if $\gamma_{C^{i}-b^{i}}\left(\tilde{x}^{i}-b^{i}\right)\leq 1$ , and if $y_{i}>0$ this last condition is in turn equivalent to $\tilde{x}^{i}=x^{i}/y_{i}$ for some $x^{i}\in\mathbb{R}^{n}$ such that $\gamma_{C^{i}-b^{i}}\left(x^{i}-b^{i}y_{i}\right)\leq y_{i}$ . Then note that if $y_{i}=0$ , then $\gamma_{C^{i}-b^{i}}\left(x^{i}-b^{i}y_{i}\right)\leq y_{i}$ if and only if $x^{i}\in C^{i}_{\infty}$ . To show that (34) implies (35) simply let $x^{i}=y_{i}\tilde{x}^{i}$ . For the reverse implication assume without loss of generality that $y_{1}>0$ and let $I_{0}=\left\{i\in\left\llbracket k\right\rrbracket\,:\,y_{i}=0\right\}$ . Then $\tilde{x}^{1}=x^{1}/y_{1}+\sum_{i\in I_{0}}x^{i}\in C^{1}$ by $\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$ . Finally, the implication follows because $\tilde{x}^{i}=x^{i}/y_{i}\in C^{i}$ for all $i\in\left\llbracket k\right\rrbracket\setminus I_{0}$ . Part 3 follows directly from part 2. ∎
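The homogenization step used in this proof can be checked on a simple set. The sketch below assumes $C$ is a Euclidean ball of radius $\rho$ centred at $b$ , so that $\gamma_{C-b}\left(z\right)=\left\|z\right\|/\rho$ ; this choice and the function names are illustrative only. It verifies that $\tilde{x}\in C$ matches $\gamma_{C-b}\left(x-by\right)\leq y$ for $x=y\tilde{x}$ with $y>0$ , and that for $y=0$ the inequality forces $x\in C_{\infty}=\left\{{\bf 0}\right\}$ .

```python
from math import hypot


def in_ball(x, center, radius, tol=1e-9):
    """Membership in the ball C = center + radius * (unit ball)."""
    return hypot(*(xi - ci for xi, ci in zip(x, center))) <= radius + tol


def homogenized_ok(x, y, center, radius, tol=1e-9):
    """Check gamma_{C-b}(x - b*y) <= y for the ball C with b = center."""
    z = [xi - ci * y for xi, ci in zip(x, center)]
    return hypot(*z) / radius <= y + tol


center, radius, y = (1.0, 2.0), 0.5, 0.4
for x_tilde in [(1.3, 2.4), (1.6, 2.0)]:
    x = tuple(y * c for c in x_tilde)          # x = y * x_tilde
    print(x_tilde, in_ball(x_tilde, center, radius), homogenized_ok(x, y, center, radius))

# For y = 0 only the recession cone (here the origin) satisfies the inequality.
print(homogenized_ok((0.0, 0.0), 0.0, center, radius),
      homogenized_ok((0.1, 0.0), 0.0, center, radius))
```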

8.2 Proof of Proposition 3

Lemma 5

Let $C\subseteq\mathbb{R}^{n}$ be a closed convex set containing $\bf 0$ , $\left\{v^{j}\right\}_{j=1}^{n}\subseteq\mathbb{R}^{n}$ be an orthonormal basis of $\mathbb{R}^{n}$ , $s\in\left\{-1,0,1\right\}^{n}$ , $t\in\left\{-1,1\right\}^{n}$ , $K=\operatorname{cone}\left(\left\{s_{j}v^{j}\right\}_{j=1}^{n}\right)$ , $M=\operatorname{cone}\left(\left\{t_{j}v^{j}\right\}_{j=1}^{n}\right)$ and $u^{j}=(-s_{j}t_{j})^{+}s_{j}v^{j}$ for all $j\in\left\llbracket n\right\rrbracket$ . If $C\cap K$ is compact and $\left(\left(C\cap K\right)-K\right)\cap K=C\cap K$ then $\left(x,y\right)\in\operatorname{epi}\left(\gamma_{C\cap K+M}\right)$ if and only if

[TABLE]

Proof

Let $E$ be the region described by (36). We have that $E$ is a closed convex cone such that $y\geq 0$ for all $\left(x,y\right)\in E$ so by Lemma 2 we just need to show that $\left(x,0\right)\in E$ is equivalent to $x\in\left(C\cap K+M\right)_{\infty}=M$ and that $\left(x,1\right)\in E$ is equivalent to $x\in C\cap K+M$ .

For the first implication of both equivalences let $y\in\left\{0,1\right\}$ , $C_{1}=C$ , $C_{0}=C_{\infty}$ , $\left(x,y\right)\in E$ , $J=\left\{j\in\left\llbracket n\right\rrbracket\,:\,u^{j}\cdot x>0\right\}$ , $x^{C}=\sum_{j\in J}v^{j}v^{j}\cdot x$ and $x^{M}=\sum_{j\in\left\llbracket n\right\rrbracket\setminus J}v^{j}v^{j}\cdot x$ . Because $\left\{v^{j}\right\}_{j=1}^{n}\subseteq\mathbb{R}^{n}$ is an orthonormal basis we have $x=x^{C}+x^{M}$ . Furthermore, $\sum\nolimits_{j=1}^{n}u^{j}(u^{j}\cdot x)^{+}=\sum\nolimits_{j=1}^{n}v^{j}v^{j}\cdot x^{C}=x^{C}$ so $x^{C}\in C_{y}$ , and $s_{j}v^{j}\cdot x^{C}>0$ for all $j\in J$ and $s_{j}v^{j}\cdot x^{C}=0$ for all $j\in\left\llbracket n\right\rrbracket\setminus J$ so $x^{C}\in K$ . Finally, if $j\in\left\llbracket n\right\rrbracket\setminus J$ , then either (i) $s_{j}=-t_{j}$ and $t_{j}v^{j}\cdot x\geq 0$ , or (ii) $s_{j}\in\left\{0,t_{j}\right\}$ . In the second case the linear inequalities of (36) imply $t_{j}v^{j}\cdot x\geq 0$ . Then $t_{j}v^{j}\cdot x^{M}\geq 0$ for all $j\in\left\llbracket n\right\rrbracket\setminus J$ and $t_{j}v^{j}\cdot x^{M}=0$ for all $j\in J$ . Hence, $x^{M}\in M$ and for $y=1$ we have $x\in C\cap K+M$ . Similarly for $y=0$ we have $x\in C_{\infty}\cap K+M=\left(C\cap K\right)_{\infty}+M=M$ (cf. Proposition A.2.2.5 in hiriart-lemarechal-2001 ).

For the reverse implication of the first equivalence note that if $x\in M$ , then $\left(x,0\right)\in E$ because $x$ satisfies the linear inequalities of (36) and $\left(u^{j}\cdot x\right)^{+}=0$ for all $j\in\left\llbracket n\right\rrbracket$ . For the reverse implication of the second equivalence let $x=x^{C}+x^{M}$ with $x^{C}\in C\cap K$ and $x^{M}\in M$ . Then $x$ satisfies the linear inequalities of (36) because both $x^{C}$ and $x^{M}$ satisfy them. Now let $J:=\left\{j\in\left\llbracket n\right\rrbracket\,:\,s_{j}=-t_{j},\;s_{j}v^{j}\cdot\left(x^{C}+x^{M}\right)>0\right\}$ and $\tilde{x}:=\sum\nolimits_{j\in\left\llbracket n\right\rrbracket\setminus J}v^{j}v^{j}\cdot x^{C}+\sum\nolimits_{j\in J}v^{j}\left(-v^{j}\cdot x^{M}\right)$ . By the definition of $J$ and because $x^{M}\in M$ and $x^{C}\in K$ we have $\tilde{x}\in K$ and $x^{C}-\tilde{x}=\sum_{j\in J}v^{j}v^{j}\cdot\left(x^{C}+x^{M}\right)\in K$ . Then $x^{C}-\tilde{x}\in\left(\left(C\cap K\right)-K\right)\cap K$ and hence by the assumption on $C$ and $K$ we have $x^{C}-\tilde{x}\in C$ . Then $x$ satisfies the non-linear inequality of (36) because $J=\left\{j\in\left\llbracket n\right\rrbracket\,:\,u^{j}\cdot\left(x^{C}+x^{M}\right)>0\right\}$ and hence $x^{C}-\tilde{x}=\sum_{j\in J}v^{j}v^{j}\cdot\left(x^{C}+x^{M}\right)=\sum\nolimits_{j=1}^{n}u^{j}(u^{j}\cdot\left(x^{C}+x^{M}\right))^{+}$ .∎

Proof (of Proposition 3)

The result will follow from Proposition 1 by showing that (10) is the projection of the continuous relaxation of (2) for the considered sets. Noting that $(-0t)^{+}0=0$ for all $t\in\left\{-1,1\right\}$ we can use Lemma 5 to show that the continuous relaxation of (2) is given by

[TABLE]

Now, for all $i\in\left\llbracket k\right\rrbracket$ and $j\in\left\llbracket n\right\rrbracket$ such that $s_{j}=0$ or $s_{j}=t_{j}$ we have $\underline{b}^{i}_{j}=t_{j}v^{j}\cdot b^{i}$ , so (37b) is dominated by $t_{j}v^{j}\cdot x^{i}-\underline{b}^{i}_{j}y_{i}\geq 0$ for all $i\in\left\llbracket k\right\rrbracket$ and $j\in\left\llbracket n\right\rrbracket$ (the additional inequalities for case $s_{j}=t_{j}$ are clearly valid). To show that (10) is contained in the projection of (37) let $\left(x,y\right)$ be feasible for (10) and for all $i\in\left\llbracket k\right\rrbracket$ let $\lambda_{j}^{i}:=y_{i}t_{j}\underline{b}^{i}_{j}$ if $j\in\left\llbracket n\right\rrbracket\setminus J_{i}$ and $\lambda_{j}^{i}=v^{j}\cdot x-\sum_{l\in\left\llbracket k\right\rrbracket\setminus\left\{i\right\}}\lambda_{j}^{l}$ if $j\in J_{i}$ . Finally, for all $i\in\left\llbracket k\right\rrbracket$ let $x^{i}=\sum_{j=1}^{n}\lambda_{j}^{i}v^{j}$ . We can check that $\left(x,\left(x^{i}\right)_{i=1}^{k},y\right)$ is feasible for (37). In particular, $\left(x,y\right)$ feasible for (10a) implies $\left(x^{i},y_{i}\right)$ is feasible for (37b) because $u^{i,j}=0$ if $s^{i}_{j}\neq-t_{j}$ and if $s^{i}_{j}=-t_{j}$ , then $\left(-s_{j}^{i}t_{j}\right)^{+}s_{j}^{i}t_{j}=-1$ , $\underline{b}_{j}^{l}=-\bar{b}^{i,l}_{j}$ , and hence $u^{i,j}\cdot\left(x^{i}-b^{i}y_{i}\right)=u^{i,j}\cdot(x-b^{i}y_{i})+\sum_{l\in\left\llbracket k\right\rrbracket\setminus\left\{i\right\}}y_{i}\underline{b}^{i}_{j}=u^{i,j}\cdot x-\sum\nolimits_{l=1}^{k}\overline{b}^{i,l}_{j}y_{l}$ . The reverse inclusion follows from validity of (10) plus $y\in\mathbb{Z}^{k}$ as a formulation for $x\in\bigcup_{i=1}^{k}C^{i}$ .∎

8.3 Proof of Proposition 4

Proposition 7

For a closed convex set $C\subseteq\mathbb{R}^{n}$ we have that $\operatorname{rbd}\left(C\right)=\bigcup_{d\in D\cap L(C)}F_{C}\left(d\right)$ for $D=L(C)\setminus\left\{0\right\}$ or $D=L(C)\cap\operatorname{dom}\left(\sigma_{C}\right)\setminus\left\{0\right\}$ . In addition, if $u\in\mathbb{R}^{n}$ and $w\in L(C)^{\perp}$ , then $F_{C}(u)=F_{C}(u-w)$ .

Proof

The proof of the first statement is identical to that of Proposition C.3.1.5 in hiriart-lemarechal-2001 . For the second note that by Definition C.2.1.4 and Proposition C.1.1.7 we have that $\sigma_{C}(w)=\sigma_{C}(-w)$ and $\sigma_{C}(u-w)=\sigma_{C}(u)+\sigma_{C}(-w)$ . Furthermore, $-w\cdot x=\sigma_{C}(-w)$ for all $x\in C$ . Then for any $x\in C$ we have

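$$x\in F_{C}(u)\;\Leftrightarrow\;u\cdot x=\sigma_{C}(u)\;\Leftrightarrow\;(u-w)\cdot x=\sigma_{C}(u)+\sigma_{C}(-w)=\sigma_{C}(u-w)\;\Leftrightarrow\;x\in F_{C}(u-w).$$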
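The second statement of Proposition 7 can be checked numerically on a small polytope. The following is a minimal sketch, assuming $C$ is given by a finite vertex list (so that $F_{C}(u)$ is the convex hull of the vertices attaining $\sigma_{C}(u)$); the helper names `dot` and `exposed_vertices` and the example data are hypothetical and not part of the paper.

```python
from fractions import Fraction


def dot(a, b):
    """Exact inner product of two vectors with rational entries."""
    return sum(Fraction(p) * Fraction(q) for p, q in zip(a, b))


def exposed_vertices(vertices, u):
    """Vertices of conv(vertices) attaining sigma_C(u); their hull is F_C(u)."""
    best = max(dot(u, x) for x in vertices)
    return {tuple(x) for x in vertices if dot(u, x) == best}


# Hypothetical example: a segment whose affine hull is the line x_2 = 1,
# so L(C) = span{(1, 0)} and L(C)^perp = span{(0, 1)}.
segment = [(0, 1), (1, 1)]
u = (1, 0)
w = (0, 5)  # w lies in L(C)^perp
u_minus_w = tuple(a - b for a, b in zip(u, w))

# Shifting u by an element of L(C)^perp should leave the exposed face unchanged.
same_face = exposed_vertices(segment, u) == exposed_vertices(segment, u_minus_w)
print(same_face)
```

The check only covers one polytope and one direction, so it illustrates the statement rather than verifying it.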

Lemma 6

Let $\mathcal{C}:=\left\{C^{i}\right\}_{i=1}^{k}\in\mathbb{C}_{n}$, $u\in\mathbb{R}^{n}$ and $v\in\mathbb{R}^{k}$. Then $\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right)=\max_{i=1}^{k}\left\{\sigma_{C^{i}}\left(u\right)+v\cdot e^{i}\right\}$ and $F_{Q\left(\mathcal{C}\right)}\left({u,v}\right)=\operatorname{conv}\left(\bigcup_{i\in I\left(u,v\right)}F_{C^{i}}\left(u\right)\times\left\{e^{i}\right\}\right)$ for $I\left(u,v\right):=\left\{i\in\left\llbracket k\right\rrbracket\,:\,\sigma_{C^{i}}\left(u\right)+v\cdot e^{i}=\sigma_{Q\left(\mathcal{C}\right)}\left({u,v}\right)\right\}$.

Proof

The characterization of $\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right)$ is direct from Theorem C.3.3.2 in hiriart-lemarechal-2001 . For the characterization of the face of $Q\left(\mathcal{C}\right)$ exposed by $\left(u,v\right)$ note that $\left(x,y\right)\in F_{Q\left(\mathcal{C}\right)}\left({u,v}\right)$ if and only if there exist $\lambda\in\Delta^{k}:=\left\{\lambda\in\mathbb{R}^{k}_{+}\,:\,\sum_{i=1}^{k}\lambda_{i}=1\right\}$ and $x^{i}\in C^{i}$ for $i\in\left\llbracket k\right\rrbracket$ such that $x=\sum_{i=1}^{k}\lambda_{i}x^{i}$ , $y=\sum_{i=1}^{k}\lambda_{i}e^{i}$ and

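$$\sum_{i=1}^{k}\lambda_{i}\left(u\cdot x^{i}+v\cdot e^{i}\right)=\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right).\tag{38}$$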

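By the definition of $\sigma_{C^{i}}$ and the characterization of $\sigma_{Q\left(\mathcal{C}\right)}$, for all $i\in\left\llbracket k\right\rrbracket$ we have

$$u\cdot x^{i}+v\cdot e^{i}\leq\sigma_{C^{i}}\left(u\right)+v\cdot e^{i}\leq\sigma_{Q\left(\mathcal{C}\right)}\left(u,v\right).\tag{39}$$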

So if $\lambda_{i}>0$ in (38) for $i\in\left\llbracket k\right\rrbracket$ then both inequalities in (39) hold as equalities for $i$ . Then (38) holds if and only if for all $i\in\left\llbracket k\right\rrbracket$ with $\lambda_{i}>0$ we have (i) $u\cdot x^{i}=\sigma_{C^{i}}\left(u\right)$ or equivalently $x^{i}\in F_{C^{i}}\left(u\right)$ , and (ii) $\sigma_{C^{i}}\left(u\right)+v\cdot e^{i}=\sigma_{Q\left(\mathcal{C}\right)}\left({u,v}\right)$ .∎
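For polytopes, the support-function formula of Lemma 6 can be compared against a direct maximization over the points $\left(x,e^{i}\right)$ with $x$ a vertex of $C^{i}$. The sketch below assumes each $C^{i}$ is given by a finite vertex list and uses 0-based indices for $I\left(u,v\right)$; the function names and the two triangles are hypothetical illustrations.

```python
from fractions import Fraction


def dot(a, b):
    """Exact inner product of two vectors with rational entries."""
    return sum(Fraction(p) * Fraction(q) for p, q in zip(a, b))


def support(vertices, u):
    """Support function of conv(vertices) at u."""
    return max(dot(u, x) for x in vertices)


def cayley_support(polytopes, u, v):
    """Formula of Lemma 6: max_i (sigma_{C^i}(u) + v_i) and the active set I(u, v)."""
    values = [support(C, u) + Fraction(v[i]) for i, C in enumerate(polytopes)]
    best = max(values)
    active = [i for i, val in enumerate(values) if val == best]
    return best, active


def cayley_support_bruteforce(polytopes, u, v):
    """Maximize u.x + v.y directly over the points (x, e^i), x a vertex of C^i."""
    k = len(polytopes)
    candidates = []
    for i, C in enumerate(polytopes):
        e_i = [1 if j == i else 0 for j in range(k)]
        candidates.extend(dot(u, x) + dot(v, e_i) for x in C)
    return max(candidates)


# Hypothetical data: two triangles in the plane.
C1 = [(0, 0), (1, 0), (0, 1)]
C2 = [(2, 2), (3, 2), (2, 3)]
value, active = cayley_support([C1, C2], (1, 1), (0, -4))
print(value, active)
```

When `active` contains every index, the exposed face of $Q\left(\mathcal{C}\right)$ meets every copy $C^{i}\times\left\{e^{i}\right\}$, which is the situation used in the proof of Proposition 4.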

Proof (of Proposition 4)

Let $E:=\{\left(x,y\right)\in\mathbb{R}^{n+k}\,:\,\sum\nolimits_{i=1}^{k}y_{i}=1,\quad Ax=\sum\nolimits_{i=1}^{k}Ab^{i}y_{i}\}$. The inclusion $\operatorname{aff}\left(Q\left(\mathcal{C}\right)\right)\subseteq E$ follows by noting that $\operatorname{aff}\left(C^{i}\right)\subseteq\left\{x\in\mathbb{R}^{n}\,:\,Ax=Ab^{i}\right\}$ and hence $\left(x,{\bf e}^{i}\right)\in E$ for all $x\in C^{i}$. For the reverse inclusion, let $\left(x,y\right)\in E$ and $\bar{x}=\left(x-\sum_{i=1}^{k}b^{i}y_{i}\right)$. Then $A\bar{x}={\bf 0}$, so $\bar{x}\in L\left(\mathcal{C}\right)$ and hence there exist $\bar{x}^{i}\in L\left(C^{i}\right)$ for $i\in\left\llbracket k\right\rrbracket$ such that $\bar{x}=\sum_{i=1}^{k}\bar{x}^{i}$. For any $i\in\left\llbracket k\right\rrbracket$ and $\lambda\neq 0$ we have $\bar{x}^{i}/\lambda\in L\left(C^{i}\right)$, $\bar{x}^{i}/\lambda+b^{i}\in\operatorname{aff}\left(C^{i}\right)$ and hence $\left(\bar{x}^{i}/\lambda+b^{i},{\bf e}^{i}\right)\in\operatorname{aff}\left(C^{i}\times\left\{{\bf e}^{i}\right\}\right)\subseteq\operatorname{aff}\left(Q\left(\mathcal{C}\right)\right)$. In particular, for any $i\in\left\llbracket k\right\rrbracket$ we have $\left(\bar{x}^{i},{\bf 0}\right)=\left(\bar{x}^{i}/2+b^{i},{\bf e}^{i}\right)-\left(-\bar{x}^{i}/2+b^{i},{\bf e}^{i}\right)\in L\left(Q\left(\mathcal{C}\right)\right)$ and if $y_{i}\neq 0$ we have $\left(\bar{x}^{i}/y_{i}+b^{i},{\bf e}^{i}\right)\in\operatorname{aff}\left(Q\left(\mathcal{C}\right)\right)$. Then, letting $I_{0}=\left\{i\in\left\llbracket k\right\rrbracket\,:\,y_{i}=0\right\}$ and $I_{1}=\left\llbracket k\right\rrbracket\setminus I_{0}$, and using $\sum_{i\in I_{1}}y_{i}=1$, we have $\left(x,y\right)=\sum_{i\in I_{1}}y_{i}\left(\bar{x}^{i}/y_{i}+b^{i},{\bf e}^{i}\right)+\sum_{i\in I_{0}}\left(\bar{x}^{i},{\bf 0}\right)\in\operatorname{aff}\left(Q\left(\mathcal{C}\right)\right)$.

Then by Proposition 7 we have $\operatorname{rbd}\left(Q\left(\mathcal{C}\right)\right)=\bigcup\nolimits_{\left(u,v\right)\in L\left(\mathcal{C}\right)\setminus\left\{0\right\}}F_{Q\left(\mathcal{C}\right)}\left(u,v\right)$. The result will follow by refining the right-hand side of this equality to include only the $F_{Q\left(\mathcal{C}\right)}\left(u,v\right)$ that are maximal with respect to inclusion.

We begin by showing that (11) corresponds to the maximal faces when $u\in\mathcal{U}\left(\mathcal{C}\right)$. Indeed, from Lemma 6 we only need to show that for all $\bar{u}\in\mathcal{U}\left(\mathcal{C}\right)$ there exists $\left(u,v\right)\in L\left(\mathcal{C}\right)\setminus\left\{0\right\}$ such that $I\left(u,v\right)=\left\llbracket k\right\rrbracket$ and $F_{C^{i}}\left(\bar{u}\right)=F_{C^{i}}\left(u\right)$ for all $i\in\left\llbracket k\right\rrbracket$. To this end, first let $\bar{v}\in\mathbb{R}^{k}$ be such that $\bar{v}_{1}=-\frac{1}{k}\sum_{i=2}^{k}\left(\sigma_{C^{1}}(\bar{u})-\sigma_{C^{i}}(\bar{u})\right)$ and $\bar{v}_{j}=\sigma_{C^{1}}(\bar{u})-\sigma_{C^{j}}(\bar{u})-\frac{1}{k}\sum_{i=2}^{k}\left(\sigma_{C^{1}}(\bar{u})-\sigma_{C^{i}}(\bar{u})\right)$ for all $j\in\left\llbracket k\right\rrbracket\setminus\left\{1\right\}$. Then, $I\left(\bar{u},\bar{v}\right)=\left\llbracket k\right\rrbracket$ and $\sum_{i=1}^{k}\bar{v}_{i}=0$. If $\bar{u}-\sum\nolimits_{i=1}^{k}b^{i}\bar{v}_{i}\in L\left(\mathcal{C}\right)$ we are done by letting $\left(u,v\right)=\left(\bar{u},\bar{v}\right)$. If not, there exists $w\in L\left(\mathcal{C}\right)^{\perp}$ such that $u-\sum\nolimits_{i=1}^{k}b^{i}v_{i}\in L\left(\mathcal{C}\right)$ for $u=\bar{u}-w$ and $v=\bar{v}$. Now, for any $i\in\left\llbracket k\right\rrbracket$ we have $w\in L\left(\mathcal{C}\right)^{\perp}\subseteq L\left(C^{i}\right)^{\perp}$ and hence by Proposition 7 we have $F_{C^{i}}\left(\bar{u}\right)=F_{C^{i}}\left(\bar{u}-w\right)=F_{C^{i}}\left(u\right)$. In particular, for all $i\in\left\llbracket k\right\rrbracket$ there exists $x^{i}\in F_{C^{i}}\left(u\right)$ such that $\sigma_{C^{i}}(\bar{u}-w)=x^{i}\cdot\left(\bar{u}-w\right)$ and $\sigma_{C^{i}}(\bar{u})=x^{i}\cdot\bar{u}$.
Then ${\sigma_{C^{1}}(\bar{u})-\sigma_{C^{i}}(\bar{u})}={\sigma_{C^{1}}(\bar{u}-w)-\sigma_{C^{i}}(\bar{u}-w)}$ for all $i\in\left\llbracket k\right\rrbracket$ and hence $I\left({u},{v}\right)=\left\llbracket k\right\rrbracket$ and $\sum_{i=1}^{k}{v}_{i}=0$ by the definition of $\bar{v}=v$ and $u=\bar{u}-w$.
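The two properties claimed for $\bar{v}$, namely $I\left(\bar{u},\bar{v}\right)=\left\llbracket k\right\rrbracket$ and $\sum_{i=1}^{k}\bar{v}_{i}=0$, are simple arithmetic and can be checked on sample values. The sketch below takes the numbers $\sigma_{C^{i}}(\bar{u})$ as input; the function name `balancing_v` and the sample values are hypothetical.

```python
from fractions import Fraction


def balancing_v(sigmas):
    """Build v-bar from the proof, where sigmas[i] plays the role of sigma_{C^{i+1}}(u-bar).

    v_1 = -(1/k) * sum_{i>=2} (sigma_1 - sigma_i)
    v_j = (sigma_1 - sigma_j) - (1/k) * sum_{i>=2} (sigma_1 - sigma_i)   for j >= 2
    """
    k = len(sigmas)
    s1 = Fraction(sigmas[0])
    shift = sum(s1 - Fraction(s) for s in sigmas[1:]) / k
    # For j = 1 the first term vanishes, so one expression covers both cases.
    return [(s1 - Fraction(s)) - shift for s in sigmas]


# Hypothetical support-function values for k = 3 sets.
sigmas = [1, 5, 2]
v_bar = balancing_v(sigmas)

# All sums sigma_i + v_i coincide, so every index is active in I(u-bar, v-bar).
levels = {Fraction(s) + vi for s, vi in zip(sigmas, v_bar)}
print(v_bar, sum(v_bar), levels)
```

The common value of $\sigma_{C^{i}}(\bar{u})+\bar{v}_{i}$ is then $\sigma_{Q\left(\mathcal{C}\right)}\left(\bar{u},\bar{v}\right)$ by Lemma 6.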

We can also check that (12) corresponds to the maximal faces exposed by $\left(0,v\right)$ for $v\in\mathbb{R}^{k}$, which are precisely those exposed when there exists $i\in\left\llbracket k\right\rrbracket$ such that $v_{i}=1-k$ and $v_{j}=1$ for all $j\neq i$.

The last case is when $u\in L\left(\mathcal{C}\right)\setminus\left\{0\right\}$ and there exists $\emptyset\neq I\subseteq\left\llbracket k\right\rrbracket$ such that $F_{C^{i}}\left(u\right)=\emptyset$ for $i\in I$ and $F_{C^{i}}\left(u\right)\neq\emptyset$ for $i\in\left\llbracket k\right\rrbracket\setminus I$. An argument analogous to the case $u\in\mathcal{U}\left(\mathcal{C}\right)$ shows that the maximal faces here correspond to $\left(u,v\right)\in L\left(\mathcal{C}\right)\setminus\left\{0\right\}$ such that $I\left(u,v\right)=\left\llbracket k\right\rrbracket\setminus I$. However, those faces are contained in $\operatorname{conv}\left(\bigcup\nolimits_{j\neq i}C^{j}\times\left\{{\bf e}^{j}\right\}\right)$ for any $i\in I$, which are already included in (12).

The alternative characterizations for (11)/(12) follow from the fact that $x\in F_{C}\left(u\right)$ if and only if $u\in N_{C}\left(x\right)$ (e.g. Proposition C.3.1.4 in hiriart-lemarechal-2001 ).∎

8.4 Proof of Proposition 6

Proof (of Proposition 6)

Property (28) implies $\bigcup\nolimits_{i=1}^{k}C^{i}\times\left\{e^{i}\right\}\subseteq Q$, which shows $Q\left(\mathcal{C}\right)\subseteq Q$, $Q\left(\mathcal{C}\right)_{\infty}\subseteq Q_{\infty}$ and (29). Furthermore, $Q\subseteq\mathbb{R}^{n}\times\Delta^{k}$ implies $Q_{\infty}\subseteq\mathbb{R}^{n}\times\left\{0\right\}$, and $Q\left(e^{i}\right)=C^{i}\times\left\{e^{i}\right\}$ further implies that $Q_{\infty}\subseteq C^{1}_{\infty}\times\left\{0\right\}=Q\left(\mathcal{C}\right)_{\infty}$.

That Part 1 implies Part 2 is direct from the definition of $Q\left(\mathcal{C}\right)$, which together with (29) shows their equivalence. That Part 2 implies Part 3 is direct.

To show that Part 3 implies Part 1, we show that if $Q\left(\mathcal{C}\right)\subsetneq Q$, then there exists $\tilde{x}\in\mathbb{R}^{n}$ such that $\left(\tilde{x},\frac{1}{k}\mathbf{1}\right)\in Q$ and $\tilde{x}\notin\frac{1}{k}\sum_{i=1}^{k}C^{i}$. For this we first claim that if $\left(\bar{x},\bar{y}\right)\in Q\setminus Q\left(\mathcal{C}\right)$ then there exist $a\in\mathbb{R}^{n}$, $b\in\mathbb{R}^{k}$ and $c\in\mathbb{R}$ that satisfy the following three separation conditions: (i) $a\cdot\bar{x}+b\cdot\bar{y}>c$, (ii) $a\cdot x+b\cdot y\leq c$ for all $\left(x,y\right)\in Q\left(\mathcal{C}\right)$, and (iii) for all $i\in\left\llbracket k\right\rrbracket$ and $\varepsilon>0$ there exists $\bar{x}^{i}\left(\varepsilon\right)\in C^{i}$ such that $a\cdot\bar{x}^{i}\left(\varepsilon\right)+b\cdot e^{i}\geq c-\varepsilon$. Indeed, the first two follow from the separation theorem for closed convex sets. If the third condition does not hold for some $i\in\left\llbracket k\right\rrbracket$, then $\sup\left\{a\cdot x\,:\,x\in C^{i}\right\}<c-b_{i}$, and because $\bar{y}\geq 0$ we can increase $b_{i}$ to achieve equality while still satisfying the first two conditions.

Now, because of (29) for $Q=Q\left(\mathcal{C}\right)$ and separation condition (ii) we have

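$$a\cdot x+\frac{1}{k}\sum_{i=1}^{k}b_{i}\leq c\quad\text{for all }x\in\frac{1}{k}\sum_{i=1}^{k}C^{i}.\tag{40}$$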

Additionally, because $\bar{y}\in\Delta^{k}$ there exists $\left(\lambda_{0},\lambda\right)\in\Delta^{k+1}$ with $\lambda_{0}>0$ such that $\lambda_{0}\bar{y}+\sum_{i=1}^{k}\lambda_{i}e^{i}=\frac{1}{k}\mathbf{1}$. If $\sum_{i=1}^{k}\lambda_{i}=0$, then $\lambda_{0}=1$, $\bar{y}=\frac{1}{k}\mathbf{1}$ and $\left(\bar{x},\frac{1}{k}\mathbf{1}\right)\in Q$. Hence, because of separation condition (i) and (40) we have $\bar{x}\notin\frac{1}{k}\sum_{i=1}^{k}C^{i}$. If instead we have $\sum_{i=1}^{k}\lambda_{i}>0$, then there exists $\varepsilon>0$ such that $\lambda_{0}\left(a\cdot\bar{x}+b\cdot\bar{y}-c\right)/\left(\sum_{i=1}^{k}\lambda_{i}\right)>\varepsilon$ because of separation condition (i). For such $\left(\lambda_{0},\lambda\right)$ and $\varepsilon$ let $\left(\tilde{x},\tilde{y}\right)=\lambda_{0}\left(\bar{x},\bar{y}\right)+\sum_{i=1}^{k}\lambda_{i}\left(\bar{x}^{i}\left(\varepsilon\right),e^{i}\right)$. Because $\left(\bar{x}^{i}\left(\varepsilon\right),e^{i}\right)\in Q$ for each $i\in\left\llbracket k\right\rrbracket$ we then have that $\left(\tilde{x},\frac{1}{k}\mathbf{1}\right)\in Q$. Furthermore, because of separation conditions (i) and (iii) and the condition on $\varepsilon$, we have $a\cdot\tilde{x}+\frac{1}{k}\sum_{i=1}^{k}b_{i}>c$ and hence by (40) we have $\tilde{x}\notin\frac{1}{k}\sum_{i=1}^{k}C^{i}$. ∎
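The existence of $\left(\lambda_{0},\lambda\right)$ with $\lambda_{0}>0$ can be made explicit. One possible choice, which is an assumption of this sketch and not necessarily the one intended in the proof, is $\lambda_{0}=1/\left(k\max_{i}\bar{y}_{i}\right)$ and $\lambda_{i}=\frac{1}{k}-\lambda_{0}\bar{y}_{i}$; the function name `mix_to_barycenter` is hypothetical.

```python
from fractions import Fraction


def mix_to_barycenter(y_bar):
    """Return (lam0, lam) with lam0 > 0, lam >= 0, lam0 + sum(lam) = 1 and
    lam0 * y_bar + sum_i lam[i] * e^i = (1/k) * 1, for y_bar in the simplex."""
    k = len(y_bar)
    y = [Fraction(t) for t in y_bar]
    if any(t < 0 for t in y) or sum(y) != 1:
        raise ValueError("y_bar must lie in the standard simplex")
    # Largest admissible weight on y_bar: lam0 * y_i <= 1/k for every i.
    lam0 = Fraction(1, k) / max(y)
    lam = [Fraction(1, k) - lam0 * t for t in y]
    return lam0, lam


# Hypothetical point on the boundary of the simplex with k = 3.
y_bar = [Fraction(1, 2), Fraction(1, 2), 0]
lam0, lam = mix_to_barycenter(y_bar)

# The i-th coordinate of lam0 * y_bar + sum_i lam[i] * e^i is lam0 * y_i + lam[i].
combo = [lam0 * Fraction(t) + l for t, l in zip(y_bar, lam)]
print(lam0, lam, combo)
```

Since $\max_{i}\bar{y}_{i}\geq\frac{1}{k}$ for any $\bar{y}\in\Delta^{k}$, this choice satisfies $0<\lambda_{0}\leq 1$, with $\lambda_{0}=1$ exactly when $\bar{y}=\frac{1}{k}\mathbf{1}$.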

Acknowledgements.

This research was partially supported by NSF under grant CMMI-1351619. We thank two anonymous referees for their constructive comments that improved the paper’s presentation.

Bibliography


1. Andradas, C., Ruiz, J.M.: Ubiquity of Łojasiewicz’s example of a nonbasic semialgebraic set. The Michigan Mathematical Journal 41, 465–472 (1994)
2. Balas, E.: On the convex-hull of the union of certain polyhedra. Operations Research Letters 7, 279–283 (1988)
3. Ben-Tal, A., Nemirovski, A.: Lectures on modern convex optimization: analysis, algorithms, and engineering applications. Society for Industrial Mathematics (2001)
4. Bestuzheva, K., Hijazi, H., Coffrin, C.: Convex relaxations for quadratic on/off constraints and applications to optimal transmission switching (2016). Optimization Online, http://www.optimization-online.org/DB_HTML/2016/07/5565.html
5. Blair, C.: Representation for multiple right-hand sides. Math. Program. 49, 1–5 (1990)
6. Blekherman, G., Parrilo, P., Thomas, R.: Semidefinite Optimization and Convex Algebraic Geometry. MPS-SIAM Series on Optimization. SIAM (2013)
7. Bonami, P., Lodi, A., Tramontani, A., Wiese, S.: On mathematical programming with indicator constraints. Math. Program. 151, 191–223 (2015)
8. Ceria, S., Soares, J.: Convex programming for disjunctive convex optimization. Math. Program. 86, 595–614 (1999)