Convexification of bilinear terms over network polytopes

Erfan Khademnia; Danial Davarnia

arXiv:2302.14151·math.OC·March 27, 2024·Math. Oper. Res.

TL;DR

This paper develops methods to precisely convexify bilinear terms over network polytopes, improving bounds in network optimization problems by explicitly characterizing the convex hull using network structures.

Contribution

It introduces systematic procedures to obtain the convex hull of bilinear sets over network polytopes, explicitly deriving facet inequalities using tree and forest structures.

Findings

1. The proposed convexification methods improve dual bounds in network optimization.

2. Explicit facet inequalities are derived for bilinear sets over network polytopes.

3. Computational experiments demonstrate the effectiveness of the new convexification techniques.

Abstract

It is well known that the McCormick relaxation for the bilinear constraint $z = x y$ gives the convex hull over the box domains for $x$ and $y$. In network applications where the domain of bilinear variables is described by a network polytope, the McCormick relaxation, also referred to as linearization, fails to provide the convex hull and often leads to poor dual bounds. We study the convex hull of the set containing bilinear constraints $z_{i, j} = x_{i} y_{j}$ where $x_{i}$ represents the arc-flow variable in a network polytope, and $y_{j}$ is in a simplex. For the case where the simplex contains a single $y$ variable, we introduce a systematic procedure to obtain the convex hull of the above set in the original space of variables, and show that all facet-defining inequalities of the convex hull can be obtained explicitly through identifying a special tree structure in the underlying network. For…

Tables (6)

Table 1: Evaluating EC&R cutting planes for the bipartite structure with $m=1$

Size	#	Opt	LP	Full EC&R		Separation EC&R
				Gap (%)	Time (s)	Gap (%)	Time (s)
10	1	3501	3035.53	99	0.44	99	0.30
	2	1982.63	1828.85	99	0.27	99	0.15
	3	2846	2376.23	91	0.84	90	0.62
	4	3160.86	3056.15	99	0.29	99	0.14
	5	5929	5825.98	99	0.24	99	0.15
	6	4487.71	3743.12	74	0.89	74	0.65
	7	2009.02	1858.60	99	0.33	99	0.37
	8	4531.55	4215.66	91	0.87	83	0.61
	9	3238.87	3015.88	99	0.45	99	0.48
	10	2302	2119.27	98	0.49	91	0.47
	avg.			95	0.51	93	0.40
30	1	2219.82	1976.29	99	22.85	99	3.02
	2	1619.88	1444.26	99	22.79	99	1.49
	3	3678.58	3236.11	99	23.49	99	2.95
	4	3010.56	2544.60	99	41.96	92	5.96
	5	1959.27	1777.84	99	22.68	80	3.11
	6	4315.31	3379.29	90	81.48	99	7.67
	7	854.26	709.06	99	22.70	85	1.58
	8	1604.93	1324.22	99	23.01	99	2.97
	9	309.39	105.43	99	21.95	99	2.76
	10	3224.44	2980.87	99	21.61	99	3.17
	avg.			98	30.45	95	3.47
50	1	2378.15	2012.32	99	170.22	99	14.21
	2	1798.19	1544.38	99	179.22	99	9.23
	3	1100.05	758.45	99	179.77	96	14.86
	4	-1320.24	-2103.91	99	177.30	99	9.67
	5	-2517.20	-2941.66	99	185.33	99	9.53
	6	-3396.62	-3607.08	99	183.70	99	4.64
	7	515.27	7.30	99	188.80	99	9.81
	8	668.71	-126.09	99	181.87	99	13.80
	9	442.04	44.26	99	185.06	99	4.59
	10	-1987.83	-2479.73	99	182.14	99	13.59
	avg.			99	181.34	99	10.39

Table 2: Evaluating EC&R cutting planes for the bipartite structure with $m=2$

Size	#	Opt	LP	Tree EC&R				Forest EC&R
				Full		Separation		Full		Separation
				Gap (%)	Time (s)	Gap (%)	Time (s)	Gap (%)	Time (s)	Gap (%)	Time (s)
10	1	3820.43	3304.37	72	2.68	66	0.90	80	4.19	76	3.78
	2	4670	3722.12	94	2.74	90	1.26	99	4.06	96	2.85
	3	3046.69	2296.67	88	2.75	86	1.28	99	2.86	99	3.88
	4	2091.19	1617.27	99	1.95	99	1.02	99	1.50	99	1.89
	5	4744.34	3833.32	86	2.06	74	1.25	96	4.15	96	3.81
	6	2994.59	2247.07	93	2.69	93	1.29	99	5.42	98	4.61
	7	4790.34	4366.39	99	1.72	99	0.65	99	1.56	99	1.98
	8	6438.68	5781.10	99	1.58	99	0.77	99	1.50	99	1.91
	9	995.54	514.49	81	1.73	55	0.92	99	1.47	99	1.90
	10	3127.59	2985.04	99	1.35	99	0.36	99	1.53	99	0.94
	avg.			91	2.13	86	0.97	97	2.82	96	2.75
30	1	-243.66	-3690.76	28	165.79	17	9.86	49	497.22	31	36.26
	2	771.06	-2039.13	16	172.91	09	10.42	49	505.36	29	27.15
	3	-1103.88	-3888.50	40	175.02	25	10.43	65	498.75	39	27.14
	4	2397.13	-1259.93	30	170.07	24	10.24	58	495.05	33	27.40
	5	35.46	-2069.69	59	164.67	40	19.38	78	486.38	50	54.06
	6	-535.73	-3742.10	31	209.90	22	17.33	64	494.07	43	45.65
	7	510.26	-2121.24	43	174.23	27	13.67	85	500.75	68	64.31
	8	161.23	-2120.39	29	168.76	16	9.93	67	614.68	36	35.82
	9	2727.77	-355.00	58	172.26	52	14.50	89	494.68	71	55.88
	10	1615.33	-510.79	49	209.28	36	13.17	76	497.71	47	45.85
	avg.			41	178.29	27	12.89	68	508.46	45	41.95
50	1	-4103.47	-9633.82	41	1499.04	24	37.56	64	4248.47	41	195.21
	2	-942.02	-6275.92	60	1110.56	47	47.72	83	3960.46	62	167.52
	3	-3652.90	-11173.54	39	1085.44	26	28.84	60	3923.47	37	131.14
	4	-940.47	-5652.85	61	1006.40	48	56.31	80	3729.93	60	174.41
	5	2006.51	-3895.38	45	1024.55	35	48.22	65	3767.51	38	100.05
	6	-3040.65	-10676.33	45	999.34	32	37.38	62	2918.44	41	100.59
	7	-2074.94	-11708.06	35	1029.83	28	37.67	52	2955.59	39	125.89
	8	-4621.66	-9875.65	42	1046.62	29	38.16	67	3872.93	38	125.75
	9	-1622.87	-7510.43	48	1058.28	36	28.94	72	3881.18	46	125.50
	10	-4396.41	-11995.74	24	1010.64	17	37.90	39	2965.72	26	161.27
	avg.			44	1087.07	32	39.87	64	3622.37	42	134.62

Table 3: Evaluating EC&R cutting planes for the clique structure with $m=1$

Size	#	Opt	LP	Full EC&R		Separation EC&R
				Gap (%)	Time (s)	Gap (%)	Time (s)
10	1	6619.00	6291.72	99	2.91	78	0.30
	2	5027.37	4925.55	99	4.28	00	0.08
	3	6118.96	5957.02	99	1.57	99	0.23
	4	3494.89	3391.59	99	1.54	99	0.10
	5	2240.41	1843.81	99	1.59	99	0.17
	6	1765.00	1603.52	99	1.52	81	0.45
	7	4129.00	3939.31	99	1.51	99	0.19
	8	5514.99	5128.99	99	1.59	88	0.40
	9	8396.64	8260.95	99	1.51	99	0.10
	10	4402.24	4156.90	99	1.48	92	0.25
	avg.			99	1.95	83	0.23
20	1	6176.00	6008.36	99	36.31	99	1.69
	2	8734.00	8530.56	99	34.06	99	1.66
	3	10195.00	9492.98	86	99.73	73	6.84
	4	7020.00	6626.42	99	67.61	97	8.53
	5	7917.00	7339.43	99	97.31	95	12.27
	6	9154.00	8571.02	99	35.18	99	1.78
	7	9117.00	8849.31	99	35.13	99	1.79
	8	9804.00	9138.27	99	65.62	99	8.53
	9	5876.17	5463.65	99	66.50	99	3.36
	10	7605.72	6822.39	98	132.54	91	6.88
	avg.			98	67.00	95	5.33
30	1	7673.11	6899.80	99	360.89	88	66.99
	2	6376.51	5191.96	99	190.06	99	29.07
	3	11341.00	10650.40	99	166.24	63	28.61
	4	9045.00	7826.18	95	619.64	78	65.26
	5	6094.86	5289.02	95	636.07	87	39.74
	6	7641.00	7222.33	99	321.02	95	37.95
	7	11040.00	10413.28	99	163.22	99	35.56
	8	6862.00	6411.01	99	175.81	99	18.61
	9	7424.00	6295.28	94	786.84	85	73.03
	10	6102.74	4945.62	99	357.86	80	56.56
	avg.			98	377.77	87	45.14

Table 4: Evaluating EC&R cutting planes for the clique structure with $m=2$

Size	#	Opt	LP	Tree EC&R				Forest EC&R
				Full		Separation		Full		Separation
				Gap (%)	Time (s)	Gap (%)	Time (s)	Gap (%)	Time (s)	Gap (%)	Time (s)
10	1	5177.00	4620.61	99	3.25	99	1.40	99	9.52	99	4.41
	2	276.13	-682.07	80	11.86	76	5.78	91	25.73	88	17.34
	3	5444.39	5046.27	99	5.83	98	2.82	99	17.13	99	8.96
	4	6699.48	5954.44	97	9.06	96	4.34	99	17.43	97	13.56
	5	4897.99	4244.31	99	6.11	99	4.40	99	9.52	99	4.41
	6	5690.00	5448.84	99	3.17	99	1.37	99	9.31	99	4.46
	7	3744.00	3081.42	99	5.53	99	2.73	99	9.12	99	8.65
	8	5840.00	5591.10	99	9.02	99	4.31	99	17.52	99	8.46
	9	6995.26	6758.82	99	3.23	99	1.44	99	9.55	99	4.29
	10	3576.71	3254.87	99	3.13	99	1.37	99	9.80	99	4.11
	avg.			97	6.02	96	3.00	98	13.46	98	7.86
20	1	7519.86	5941.60	80	247.21	68	16.70	99	567.28	86	93.92
	2	6940.38	4598.74	45	259.08	25	16.97	68	750.97	50	65.45
	3	6661.91	5093.44	63	261.60	41	10.15	99	382.93	86	47.06
	4	5529.50	3884.97	72	259.85	60	17.14	98	734.08	83	45.56
	5	7517.72	6460.06	76	191.48	56	14.08	99	523.53	76	45.62
	6	6691.37	5225.75	94	315.58	72	19.68	99	515.09	86	54.63
	7	9608.17	7674.57	73	192.52	58	13.29	95	525.21	79	45.56
	8	9808.00	7723.56	60	252.08	48	13.57	79	661.85	67	46.10
	9	5940.59	3654.28	55	198.98	37	16.23	80	679.97	62	5.14
	10	7332.00	6126.00	99	266.47	93	13.47	99	352.75	94	27.71
	avg.			71	236.64	56	15.13	92	569.37	77	52.68
30	1	6912.00	5808.28	99	426.04	94	84.48	99	1062.91	99	164.67
	2	11003.69	9877.86	95	1177.46	87	86.41	96	2988.85	92	224.89
	3	6676.09	4151.90	84	1815.14	64	88.16	99	1919.25	93	280.02
	4	8375.87	5882.49	90	1420.13	69	107.86	98	4084.95	80	282.41
	5	9854.24	7338.24	99	644.18	98	88.94	99	1125.23	99	167.50
	6	7613.34	5011.54	62	1538.90	46	111.18	81	4004.75	64	367.75
	7	11367.00	8704.43	81	1266.64	65	129.98	91	5027.96	70	255.62
	8	8179.74	5631.34	73	1277.91	57	93.64	92	3662.67	76	333.25
	9	8350.00	4983.29	74	1588.36	50	90.22	91	3576.49	71	328.55
	10	10606.00	7477.32	75	1876.30	56	103.98	98	4662.70	82	360.30
	avg.			83	1303.11	69	98.48	94	3211.58	83	276.49

Table 5: Evaluating EC&R cutting planes for the cycle structure with $m=1$

Size	#	Opt	LP	Full EC&R
				Gap (%)	Time (s)
50	1	2867.15	-80.28	99	0.04
	2	5797.12	2704.93	99	0.01
	3	-1552.82	-2147.89	99	0.03
	4	5365.32	3288.67	99	0.02
	5	980.96	-1389.79	99	0.01
	6	-266.16	-2592.61	99	0.03
	7	1250.60	-354.21	99	0.02
	8	-3119.87	-4349.18	99	0.02
	9	733.80	-118.27	99	0.01
	10	2210.15	243.57	99	0.03
	avg.			99	0.02
100	1	-1783.35	-3668.73	99	0.06
	2	-2069.08	-3695.66	99	0.03
	3	8678.79	1677.03	99	0.03
	4	-4346.53	-5407.01	99	0.05
	5	1243.70	-2122.95	99	0.03
	6	15120.43	4885.57	99	0.03
	7	1235.43	-3377.43	99	0.03
	8	1256.49	-1378.17	99	0.03
	9	11766.20	4017.61	99	0.05
	10	3647.78	-435.92	99	0.04
	avg.			99	0.04
200	1	366.47	-7118.60	99	0.08
	2	-7235.71	-12323.90	99	0.12
	3	3544.59	-5999.36	99	0.07
	4	10746.70	1156.33	99	0.07
	5	12870.76	-188.56	99	0.09
	6	11889.94	3498.13	99	0.07
	7	3482.99	-4362.33	99	0.08
	8	-1885.01	-10801.39	99	0.06
	9	22166.25	11341.87	99	0.08
	10	-348.45	-8025.35	99	0.09
	avg.			99	0.08

Table 6: Evaluating EC&R cutting planes for the cycle structure with $m=2$

Size	#	Opt	LP	Full Tree EC&R		Full Forest EC&R
				Gap (%)	Time (s)	Gap (%)	Time (s)
50	1	-1590.18	-8576.67	87	0.13	99	0.26
	2	-4622.80	-13593.67	85	0.10	96	0.32
	3	-7794.96	-15165.05	63	0.11	87	0.37
	4	-1503.33	-6513.07	89	0.16	99	0.27
	5	4433.70	-6508.12	94	0.12	99	0.23
	6	-284.46	-7786.90	70	0.11	91	0.37
	7	-1140.91	-14831.40	60	0.09	82	0.32
	8	6625.82	-1629.65	93	0.09	99	0.23
	9	-2116.06	-5866.54	99	0.03	99	0.17
	10	88.65	-8106.67	89	0.09	97	0.33
	avg.			83	0.10	95	0.29
100	1	900.68	-16607.11	74	0.22	90	0.71
	2	140.56	-17583.33	92	0.22	99	0.31
	3	-131.98	-20668.01	73	0.26	87	0.66
	4	-231.62	-20736.33	80	0.34	95	0.63
	5	-1709.59	-18456.56	66	0.24	81	0.68
	6	6992.68	-16783.92	74	0.17	85	0.66
	7	1326.26	-20135.15	75	0.18	82	0.63
	8	90.11	-19442.39	72	0.17	86	0.67
	9	7523.83	-12600.35	88	0.27	99	0.55
	10	-2365.06	-18051.56	77	0.19	96	0.72
	avg.			77	0.23	90	0.62
200	1	-4498.84	-42056.21	88	0.42	95	1.37
	2	-2270.67	-41823.19	77	0.40	86	1.49
	3	9853.81	-38627.76	70	0.42	82	1.48
	4	10420.28	-24979.47	74	0.45	88	1.35
	5	-10699.73	-46329.12	77	0.43	94	1.34
	6	1474.74	-34978.20	86	0.39	99	1.08
	7	2446.84	-36567.37	78	0.44	95	1.37
	8	-7058.24	-40441.54	85	0.40	99	1.01
	9	-14652.28	-48186.48	74	0.41	93	1.37
	10	9392.03	-28443.66	81	0.42	94	1.51
	avg.			79	0.42	93	1.34

Equations

\mathcal{S}=\left\{(\bm{x};\bm{y};\bm{z})\in\Xi\times\Delta_{m}\times{\mathbb{R}}^{\kappa}\,\middle|\,\bm{y}^{\intercal}A^{k}\bm{x}=z_{k},\ \forall k\in K\right\},

\left.\begin{array}[]{lll}&\pm A^{k}_{j.}\bm{w}^{j}\mp v^{j}_{k}\geq 0,&\forall(k,j)\in K\times M\\ &\mp\left(z_{k}-\sum_{j\in M}v^{j}_{k}\right)\geq 0,&\forall k\in K\\ &E\bm{w}^{j}\geq\bm{f}y_{j},&\forall j\in M\\ &E\left(\bm{x}-\sum_{j\in M}\bm{w}^{j}\right)\geq\bm{f}\left(1-\sum_{j\in M}y_{j}\right),\\ &\bm{0}\leq\bm{w}^{j}\leq\bm{u}y_{j},&\forall j\in M\\ &\bm{0}\leq\bm{x}-\sum_{j\in M}\bm{w}^{j}\leq\bm{u}\left(1-\sum_{j\in M}y_{j}\right).\end{array}\right.

\sum_{i\in N}q_{i}(\bm{\pi})x_{i}+\sum_{j\in M}r_{j}(\bm{\pi})y_{j}+\sum_{k\in K}s_{k}(\bm{\pi})z_{k}\geq t(\bm{\pi}),

\begin{array}[]{rll}q_{i}(\bm{\pi})&=&\sum_{t\in T}E_{ti}\theta_{t}+\lambda_{i}-\mu_{i}\\ r_{j}(\bm{\pi})&=&\sum_{t\in T}f_{t}\left(\theta_{t}-\gamma^{j}_{t}\right)+\sum_{i\in N}\left(\rho^{j}_{i}-\mu_{i}\right)\\ s_{k}(\bm{\pi})&=&-\left(\beta^{+}_{k}-\beta^{-}_{k}\right)\\ t(\bm{\pi})&=&\sum_{t\in T}f_{t}\theta_{t}-\sum_{i\in N}\mu_{i},\end{array}

\mathcal{C}=\left\{\bm{\pi}\in{\mathbb{R}}_{+}^{2\kappa+(m+1)(\tau+2n)}\,\middle|\,\sum_{k\in K}A^{k}_{ji}\left(\beta^{+}_{k}-\beta^{-}_{k}\right)+\sum_{t\in T}E_{ti}\left(\gamma^{j}_{t}-\theta_{t}\right)+\eta^{j}_{i}-\rho^{j}_{i}-\lambda_{i}+\mu_{i}=0,\ \forall(i,j)\in N\times M\right\}

\mathcal{C}^{l}=\left\{\bm{\pi}^{l}\in{\mathbb{R}}_{+}^{2(\kappa-1)+(m+1)(\tau+2n)}\,\middle|\,\sum_{k\in K_{l}}A^{k}_{ji}\left(\beta^{+}_{k}-\beta^{-}_{k}\right)+\sum_{t\in T}E_{ti}\left(\gamma^{j}_{t}-\theta_{t}\right)+\eta^{j}_{i}-\rho^{j}_{i}-\lambda_{i}+\mu_{i}=\pm A^{l}_{ji},\ \forall(i,j)\in N\times M\right\},

\mathcal{C}^{l}=\left\{\bm{\pi}^{l}\in{\mathbb{R}}_{+}^{2(\kappa-1)+2(\tau+2n)}\,\middle|\,\sum_{t\in T}E_{ti}\left(\gamma^{1}_{t}-\theta_{t}\right)+\mu_{i}-\lambda_{i}+\eta^{1}_{i}-\rho^{1}_{i}+\sum_{k\in K_{l}}A^{k}_{1i}\left(\beta^{+}_{k}-\beta^{-}_{k}\right)=\pm A^{l}_{1i},\ \forall i\in N\right\}.

\Bigl{[}\begin{array}[]{c|c|c|c|c|c|c|c}E^{{}^{\intercal}}&\,-E^{{}^{\intercal}}&\,I&\,-I&\,I&\,-I&\,\bar{I}&\,-\bar{I}\end{array}\Bigr{]}.

\left[\begin{array}[]{ccc|ccc|c}\pm E_{1}&\pm I_{1}&\pm\bar{I}_{1}&\,0&0&0&\,C_{1}\\ \hline\cr 0&0&0&\pm E_{2}&\pm I_{2}&\pm\bar{I}_{2}&\,C_{2}\\ \hline\cr\hline\cr 0&0&0&0&0&0&\,C_{3}\\ \end{array}\right]\left[\begin{array}[]{c}\bm{1}\\ \hline\cr\bm{1}\\ \hline\cr\bm{0}\end{array}\right]=\left[\begin{array}[]{c}\pm\bm{e}^{l}\\ \hline\cr\bm{0}\\ \hline\cr\hline\cr\bm{0}\end{array}\right],

- z_{1, 5} - y_{1} x_{2, 3} - y_{1} x_{4, 3} + (f_{8} + f_{2} + f_{1} + f_{4} + f_{6}) y_{1} + x_{1, 5} - x_{2, 1} + x_{4, 3} - x_{8, 4} + x_{6, 2} - f_{1} - f_{4} - f_{6} \geq 0

\mathcal{S}^{1}=\left\{(\bm{x};\bm{y};\bm{z})\in\Xi\times\Delta_{1}\times{\mathbb{R}}^{\kappa}\,\middle|\,y_{1}A^{k}\bm{x}=z_{k},\ \forall k\in K\right\},

\Bigl{[}\begin{array}[]{c|c|c|c|c|c|c|c}E^{{}^{\intercal}}&\,-E^{{}^{\intercal}}&\,I&\,-I&\,I&\,-I&\,\tilde{A}&\,-\tilde{A}\end{array}\Bigr{]}.

\mathcal{S}^{1}=\left\{(\bm{x};\bm{y};\bm{w})\in\Xi\times\Delta_{1}\times{\mathbb{R}}^{n}\,\middle|\,y_{1}x_{i}=w_{i},\ \forall i\in N\right\},

\mathcal{D}=\left\{(\bm{x};\bm{y};\bm{w};\bm{z})\in\Xi\times\Delta_{1}\times{\mathbb{R}}^{n}\times{\mathbb{R}}^{\kappa}\,\middle|\,A^{k}\bm{w}=z_{k},\ \forall k\in K\right\},

\mathop{\rm conv}\left(\left(\mathcal{S}^{1}\times{\mathbb{R}}^{\kappa}\right)\cap\mathcal{D}\right)=\left(\mathop{\rm conv}(\mathcal{S}^{1})\times{\mathbb{R}}^{\kappa}\right)\cap\mathcal{D}.

\displaystyle\left[\begin{array}[]{c|c|c|c||c||c|c||cc|cc|c|cc||cc|cc|c|cc}E^{{}^{\intercal}}&\bm{0}&\,\dotsc&\,\bm{0}&\,\,-E^{{}^{\intercal}}&I&\,-I&\,I&\,-I&\bm{0}&\,\bm{0}&\,\dotsc&\,\bm{0}&\,\bm{0}&\,\bar{I}^{1}&\,-\bar{I}^{1}&\,\bm{0}&\,\bm{0}&\,\dotsc&\,\bm{0}&\,\bm{0}\\ \hline\cr\,\bm{0}&E^{{}^{\intercal}}&\,\dotsc&\,\bm{0}&\,-E^{{}^{\intercal}}&\,I&\,-I&\bm{0}&\,\bm{0}&\,I&\,-I&\,\dotsc&\,\bm{0}&\,\bm{0}&\,\bm{0}&\,\bm{0}&\bar{I}^{2}&\,-\bar{I}^{2}&\,\dotsc&\,\bm{0}&\,\bm{0}\\ \hline\cr\,\vdots&\,\vdots&\,\ddots&\,\vdots&\,\vdots&\,\vdots&\vdots&\,\vdots&\,\vdots&\,\vdots&\,\ddots&\,\vdots&\,\vdots&\,\vdots&\,\vdots&\vdots&\,\vdots&\,\ddots&\,\vdots&\,\vdots\\ \hline\cr\,\bm{0}&\,\bm{0}&\,\dotsc&E^{{}^{\intercal}}&\,-E^{{}^{\intercal}}&\,I&\,-I&\bm{0}&\,\bm{0}&\,\bm{0}&\,\bm{0}&\,\dotsc&\,I&\,-I&\,\bm{0}&\,\bm{0}&\,\bm{0}&\,\bm{0}&\,\dotsc&\bar{I}^{m}&\,-\bar{I}^{m}\\ \end{array}\right].

\displaystyle\left[\begin{array}[]{c|c|c|c||c||c||ccccc||ccccc||c}\begin{array}[]{ccc}E^{{}^{\intercal}}_{1,1}&0&0\\ 0&\ddots&0\\ 0&0&E^{{}^{\intercal}}_{1,|\Gamma_{1}|}\end{array}&\bm{0}&\,\dotsc&\,\bm{0}&\,\,-E^{{}^{\intercal}}&\pm I&\,\pm I&\bm{0}&\,\dotsc&\,\bm{0}&\,\bm{0}&\,\pm\bar{I}^{1}&\,\bm{0}&\,\dotsc&\,\bm{0}&\,\bm{0}&\,C_{1}\\ \hline\cr\,\bm{0}&\begin{array}[]{ccc}E^{{}^{\intercal}}_{2,1}&0&0\\ 0&\ddots&0\\ 0&0&E^{{}^{\intercal}}_{2,|\Gamma_{2}|}\end{array}&\,\dotsc&\,\bm{0}&\,-E^{{}^{\intercal}}&\,\pm I&\bm{0}&\,\pm I&\,\dotsc&\,\bm{0}&\,\bm{0}&\,\bm{0}&\pm\bar{I}^{2}&\,\dotsc&\,\bm{0}&\,\bm{0}&\,C_{2}\\ \hline\cr\,\vdots&\,\vdots&\,\ddots&\,\vdots&\,\vdots&\vdots&\,\vdots&\,\vdots&\,\ddots&\,\vdots&\,\vdots&\,\vdots&\,\vdots&\,\ddots&\,\vdots&\,\vdots&\,\vdots\\ \hline\cr\,\bm{0}&\,\bm{0}&\,\dotsc&\begin{array}[]{ccc}E^{{}^{\intercal}}_{m,1}&0&0\\ 0&\ddots&0\\ 0&0&E^{{}^{\intercal}}_{m,|\Gamma_{m}|}\end{array}&\,-E^{{}^{\intercal}}&\,\pm I&\bm{0}&\,\bm{0}&\,\dotsc&\,\pm I&\bm{0}&\bm{0}&\,\bm{0}&\,\dotsc&\pm\bar{I}^{m}&\,\bm{0}&\,C_{m}\\ \hline\cr\hline\cr\,\bm{0}&\,\bm{0}&\,\dotsc&\,\bm{0}&\,-E^{{}^{\intercal}}&\,\pm I&\bm{0}&\,\bm{0}&\,\dotsc&\bm{0}&\,\pm I&\bm{0}&\,\bm{0}&\,\dotsc&\,\bm{0}&\pm\bar{I}^{1,\dotsc,m}&\,C_{m+1}\\ \hline\cr\hline\cr\,\bm{0}&\,\bm{0}&\,\dotsc&\bm{0}&\,\bm{0}&\,\bm{0}&\bm{0}&\,\bm{0}&\,\dotsc&\,\bm{0}&\,\bm{0}&\bm{0}&\,\bm{0}&\,\dotsc&\bm{0}&\bm{0}&\,C_{m+2}\end{array}\right],

\left[\begin{array}[]{ccc|ccc|c}\pm E_{1}&\pm I_{1}&\pm\bar{I}_{1}&\,0&0&0&\,\bar{C}_{1}\\ \hline\cr 0&0&0&\pm E_{2}&\pm I_{2}&\pm\bar{I}_{2}&\,\bar{C}_{2}\\ \hline\cr\hline\cr 0&0&0&\,0&0&0&\,C_{m+2}\\ \end{array}\right]\left[\begin{array}[]{c}\bm{+}\\ \hline\cr\bm{+}\\ \hline\cr\bm{0}\end{array}\right]=\left[\begin{array}[]{c}\pm\bm{e}^{l}\\ \hline\cr\bm{0}\\ \hline\cr\hline\cr\bm{0}\end{array}\right],

\left[\begin{array}[]{c|c|c}P_{1}&\,0&\,C_{1}\\ \hline\cr P_{2}&\,\pm I&\,C_{2}\\ \hline\cr 0&\,0&\,C_{3}\\ \end{array}\right]\left[\begin{array}[]{c}\bar{\bm{\pi}}^{l}_{1}\\ \hline\cr\bar{\bm{\pi}}^{l}_{2}\\ \hline\cr\bm{0}\end{array}\right]=\left[\begin{array}[]{c}\pm\bm{e}^{l}\\ \hline\cr\bm{0}\\ \hline\cr\bm{0}\end{array}\right],

\left[\begin{array}[]{c|c|c|c}\pm 1&0&\cdots&0\\ \hline\cr\{0,\pm 1\}&\pm 1&\cdots&0\\ \hline\cr\vdots&\vdots&\ddots&\vdots\\ \hline\cr\{0,\pm 1\}&\{0,\pm 1\}&\cdots&\pm 1\\ \hline\cr\{0,\pm 1\}&\{0,\pm 1\}&\cdots&\{0,\pm 1\}\\ \hline\cr\vdots&\vdots&\ddots&\vdots\\ \hline\cr\{0,\pm 1\}&\{0,\pm 1\}&\cdots&\{0,\pm 1\}\end{array}\right]\bar{\bar{\bm{\pi}}}^{l}_{1}=\pm\bm{e}^{1},

- z_{1, 5} - y_{1} x_{4, 5} - y_{1} x_{3, 7} + y_{1} x_{4, 3} - y_{2} x_{1, 5} + y_{2} x_{4, 5} + y_{2} x_{2, 3} - y_{2} x_{3, 7} + (f_{1} + f_{2} + f_{3} + f_{6} + f_{8} - u_{8, 4}) y_{1} + (f_{1} + f_{3} + f_{4} - u_{8, 4}) y_{2} + x_{3, 7} - x_{2, 3} - x_{4, 3} - x_{8, 4} - f_{3} + u_{8, 4} \geq 0,

\max\left\{\sum_{i\in N}r_{i}x_{i}+\sum_{k\in K}c_{k}z_{k}\,\middle|\,(\bm{x},\bm{y},\bm{z})\in\mathcal{S}_{L}\right\},

\max\left\{\sum_{i\in N}r_{i}x_{i}+\sum_{k\in K}c_{k}z_{k}\,\middle|\,(\bm{x},\bm{y},\bm{z})\in\mathcal{S}\right\}.

Full text

Convexification of Bilinear Terms over Network Polytopes

Erfan Khademnia, Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, USA, [email protected]

Danial Davarnia, Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, USA, [email protected]

Abstract. It is well known that the McCormick relaxation for the bilinear constraint $z=xy$ gives the convex hull over the box domains for $x$ and $y$. In network applications where the domain of bilinear variables is described by a network polytope, the McCormick relaxation, also referred to as linearization, fails to provide the convex hull and often leads to poor dual bounds. We study the convex hull of the set containing bilinear constraints $z_{i,j}=x_{i}y_{j}$ where $x_{i}$ represents the arc-flow variable in a network polytope, and $y_{j}$ is in a simplex. For the case where the simplex contains a single $y$ variable, we introduce a systematic procedure to obtain the convex hull of the above set in the original space of variables, and show that all facet-defining inequalities of the convex hull can be obtained explicitly through identifying a special tree structure in the underlying network. For the generalization where the simplex contains multiple $y$ variables, we design a constructive procedure to obtain an important class of facet-defining inequalities for the convex hull of the underlying bilinear set that is characterized by a special forest structure in the underlying network. Computational experiments are presented to evaluate the effectiveness of the proposed methods.

Keywords: Network problems; Bilinear terms; McCormick relaxations; Disjunctive programming; Cutting planes

1 Introduction

Bilinear constraints in conjunction with network models appear in various mixed-integer and nonlinear programming (MINLP) applications, such as fixed-charge network flow problems [17] and network models with complementarity constraints, for example transportation problems with conflicts [11]. Bilinear terms also commonly arise in bilevel network problems that are reformulated as a single-level program, either by using a dual formulation of the inner problem or by incorporating its optimality conditions into the outer problem [4, 6]. These reformulation approaches are widely used in network interdiction problems, where the newly added bilinear terms are relaxed using a linearization technique based on the McCormick bounds [16] over a box domain; see [20] for an exposition. While these relaxations provide the convex hull of the bilinear constraint over a box domain [1], they often lead to weak relaxations when the domain of the variables becomes more complicated, as in general polyhedra [12, 7]. It has been shown [5, 15] that when the number of bilinear terms in the underlying function increases, the McCormick bounds can be very poor compared to the ones obtained from the convex and concave envelopes of the function.
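To make the weakness of McCormick relaxations over non-box domains concrete, the following Python sketch evaluates the four McCormick inequalities for $z=xy$ on a toy instance. The instance (two arc flows with $x_{1}+x_{2}\geq 1$, unit bounds, and a single $y\in[0,1]$) and the name `mccormick_bounds` are assumptions made purely for illustration; they do not come from the paper.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Return (lower, upper) McCormick bounds on z = x*y for x in [xl, xu], y in [yl, yu]."""
    lower = max(xl * y + yl * x - xl * yl,   # from (x - xl)(y - yl) >= 0
                xu * y + yu * x - xu * yu)   # from (xu - x)(yu - y) >= 0
    upper = min(xu * y + yl * x - xu * yl,   # from (xu - x)(y - yl) >= 0
                xl * y + yu * x - xl * yu)   # from (x - xl)(yu - y) >= 0
    return lower, upper


# Hypothetical toy data: flow-type constraint x1 + x2 >= 1 with 0 <= x <= 1, and 0 <= y <= 1.
x = (0.5, 0.5)
y = 0.5

# Smallest value of each z_i = y * x_i that the McCormick relaxation allows.
z_low = tuple(mccormick_bounds(xi, y, 0.0, 1.0, 0.0, 1.0)[0] for xi in x)

# Multiplying x1 + x2 >= 1 by y >= 0 gives z1 + z2 >= y, which holds on the bilinear set.
violation = y - sum(z_low)
print("McCormick-feasible z:", z_low)
print("violation of z1 + z2 >= y:", violation)
```

On this instance the point with $z=(0,0)$ satisfies every McCormick inequality yet violates $z_{1}+z_{2}\geq y$ by $0.5$, so the relaxation is strictly weaker than the convex hull once the $x$ variables are linked by a side constraint.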

There are various studies in the literature that develop convexification methods for multilinear functions as a generalization of bilinear forms, but the side constraints for the involved variables are often limited to variable bounds. For instance, [9] introduces a new class of valid inequalities, called running intersection inequalities, for the multilinear polytope described by a set of multilinear equations. The authors in [13] derive extended formulations for the convex hull of the graph of a bilinear function on the $n$-dimensional unit cube through identifying the facets of the Boolean Quadratic Polytope. In [10], an efficient method to convexify bilinear functions through McCormick relaxations is proposed which takes advantage of the structural convexity in a symmetric quadratic form. Other works in the literature consider a polyhedral, often triangular, subdivision of the domain to derive strong valid inequalities for a bilinear set; see [21, 19, 14] for examples of such approaches. Further, [8] proposes a constructive procedure, referred to as extended cancel-and-relax (EC&R), to simultaneously convexify the graph of bilinear functions over a general polytope structure. In this paper, we make use of the EC&R procedure to derive convexification results for a bilinear set where the side constraints on variables are described by a network flow model as defined next.

For $N:=\{1,\dotsc,n\}$, $M:=\{1,\dotsc,m\}$, $K:=\{1,\dotsc,\kappa\}$, and $T:=\{1,\dotsc,\tau\}$, we consider

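$\mathcal{S}=\left\{(\bm{x};\bm{y};\bm{z})\in\Xi\times\Delta_{m}\times{\mathbb{R}}^{\kappa}\,\middle|\,\bm{y}^{\intercal}A^{k}\bm{x}=z_{k},\ \forall k\in K\right\},$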

where $\Xi=\left\{\bm{x}\in{\mathbb{R}}^{n}\,\middle|\,E\bm{x}\geq\bm{f},\,\bm{0}\leq\bm{x}\leq\bm{u}\right\}$ is a primal network polytope, and $\Delta_{m}=\left\{\bm{y}\in{\mathbb{R}}_{+}^{m}\,\middle|\,\bm{1}^{\intercal}\bm{y}\leq 1\right\}$ is a simplex. When variables $\bm{y}$ are binary, $\Delta_{m}$ represents a special ordered set of type I (SOS1); see [3] for an exposition. Such simplex structures appear in various applications and can be obtained by reformulating the underlying polytopes through extreme point decomposition; see [8] for a detailed account. In the above definition, $E\in{\mathbb{R}}^{\tau\times n}$, $\bm{f}\in{\mathbb{R}}^{\tau}$, $\bm{u}\in{\mathbb{R}}^{n}$, and $A^{k}\in{\mathbb{R}}^{m\times n}$ is a matrix with all elements equal to zero except one that is equal to one, i.e., if $A^{k}_{ji}=1$ for some $(i,j)\in N\times M$, the bilinear constraint with index $k$ represents $y_{j}x_{i}=z_{k}$.
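As a minimal sketch of the definition above, the Python snippet below builds the single-entry matrices $A^{k}$ and tests whether a point $(\bm{x};\bm{y};\bm{z})$ lies in $\mathcal{S}$. The helper names `make_A` and `in_S`, the tolerance, and the tiny data set ($E=[1\ 1]$, $f=1$, $\bm{u}=(1,1)$, $m=\kappa=2$) are hypothetical choices for illustration only, with 0-based indices in the code.

```python
def make_A(m, n, j, i):
    """m x n zero matrix with a single 1 at row j, column i, so y^T A x = y_j * x_i."""
    A = [[0] * n for _ in range(m)]
    A[j][i] = 1
    return A


def in_S(x, y, z, E, f, u, A_list, tol=1e-9):
    """Check x in Xi, y in Delta_m, and y^T A^k x = z_k for every k."""
    n, m = len(x), len(y)
    # Xi: E x >= f and 0 <= x <= u
    in_Xi = (all(sum(E[t][i] * x[i] for i in range(n)) >= f[t] - tol
                 for t in range(len(f)))
             and all(-tol <= x[i] <= u[i] + tol for i in range(n)))
    # Delta_m: y >= 0 and 1^T y <= 1
    in_Delta = all(yj >= -tol for yj in y) and sum(y) <= 1 + tol
    # Bilinear equalities, one per matrix A^k
    bilinear = all(
        abs(sum(y[j] * A[j][i] * x[i] for j in range(m) for i in range(n)) - zk) <= tol
        for A, zk in zip(A_list, z))
    return in_Xi and in_Delta and bilinear


# Hypothetical instance: one non-bound constraint x1 + x2 >= 1 and unit upper bounds.
E, f, u = [[1, 1]], [1], [1, 1]
A_list = [make_A(2, 2, 0, 0), make_A(2, 2, 1, 1)]   # z_1 = y_1 x_1, z_2 = y_2 x_2
x, y = [0.6, 0.4], [0.5, 0.25]
z = [y[0] * x[0], y[1] * x[1]]

print(in_S(x, y, z, E, f, u, A_list))            # bilinear equalities hold
print(in_S(x, y, [0.3, 0.2], E, f, u, A_list))   # second equality is broken
```

Storing each $A^{k}$ as a full matrix mirrors the notation of the definition; since every $A^{k}$ has a single nonzero entry, an index pair $(j,i)$ per $k$ would carry the same information more compactly.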

The contributions of this paper are as follows. We propose a systematic procedure to convexify $\mathcal{S}$ and derive explicit inequalities in its description. The resulting cutting planes are directly obtained in the original space of variables. We show that facet-defining inequalities in the convex hull description can be explicitly derived by identifying special tree and forest structures in the underlying network, leading to an interpretable and efficient cut-generating oracle. The inequalities obtained from our proposed algorithms can be added to the typical McCormick relaxations to strengthen the formulation and improve the bounds. The presented methods consider a general network structure, and thus complement and generalize the results of [8] that are obtained for network interdiction problems. In particular, [8] presents the convexification results for a special case of $\mathcal{S}$ where $m=1$, $\kappa=1$, and $\Xi$ is a dual network polytope. In this work, we extend these results by considering the cases where $M$ and $K$ can have multiple elements and $\Xi$ is a primal network polytope.

The remainder of the paper is organized as follows. We give a brief background on the EC&R procedure as a basis of our analysis in Section 2. In Section 3, we obtain convexification results for bilinear terms defined over a network polytope. Preliminary computational experiments are presented in Section 4 to show the effectiveness of the developed cut-generating frameworks. We give concluding remarks in Section 5.

Notation. Bold letters represent vectors. For a given set $S\subseteq{\mathbb{R}}^{n}$ , we denote by $\mathop{\rm conv}(S)$ its convex hull. We use symbol $\pm$ to show both cases with $+$ and $-$ . For example, when we use $l^{\pm}$ in an expression, it means that expression holds for both cases $l^{+}$ and $l^{-}$ .

2 Extended Cancel-and-Relax

In this section, we present the EC&R procedure adapted for $\mathcal{S}$ . The following theorem is at the core of this procedure, as shown in Theorem 2.7 of [8].

Theorem 2.1

A convex hull description of $\mathcal{S}$ can be obtained by the linear constraints in $\Xi$ and $\Delta_{m}$ together with all class- $l^{\pm}$ EC&R inequalities for all $l\in K$ . \Halmos

The procedure to generate a class- $l^{\pm}$ EC&R inequality is as follows.

1. We select $l\in K$ to be the index of a bilinear constraint used in the aggregation, which we refer to as the base equality. We also select a sign indicator $+$ or $-$ to indicate whether a weight $1$ or $-1$ is used for the base equality during aggregation.

2. We select $\mathcal{L}$ and $\bar{\mathcal{L}}$ as disjoint subsets of $K\setminus\{l\}$ . Then, for each $k\in\mathcal{L}$ (resp. $k\in\bar{\mathcal{L}}$ ), we multiply $\bm{y}^{\intercal}A^{k}\bm{x}-z_{k}=0$ by $\beta^{+}_{k}$ (resp. $-\beta^{-}_{k}$ ), where $\beta^{+}_{k}\geq 0$ (resp. $\beta^{-}_{k}\geq 0$ ).

3. Defining $T$ as the index set of the non-bound constraints in $\Xi$ , we select $\mathcal{I}_{1}$ , $\cdots$ , $\mathcal{I}_{m}$ and $\bar{\mathcal{I}}$ as subsets of $T$ whose intersection is empty. Then, for each $j\in M$ and for each $t\in\mathcal{I}_{j}$ (resp. $t\in\bar{\mathcal{I}}$ ), we multiply the constraint $E_{t.}\bm{x}\geq f_{t}$ by $\gamma^{j}_{t}y_{j}$ where $\gamma^{j}_{t}\geq 0$ (resp. by $\theta_{t}(1-\sum_{i\in M}y_{i})$ where $\theta_{t}\geq 0$ ).

4. We select $\mathcal{J}$ and $\bar{\mathcal{J}}$ as disjoint subsets of $N$ . Then, for each index $i\in\mathcal{J}$ , we multiply $x_{i}\geq 0$ by $\lambda_{i}(1-\sum_{j\in M}y_{j})$ where $\lambda_{i}\geq 0$ , and for each $i\in\bar{\mathcal{J}}$ , we multiply $u_{i}-x_{i}\geq 0$ by $\mu_{i}(1-\sum_{j\in M}y_{j})$ where $\mu_{i}\geq 0$ .

The above sets are compactly represented as $\big{[}\mathcal{L},\bar{\mathcal{L}}\big{|}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ , which is called an EC&R assignment. Each EC&R assignment is identified by its class- $l^{\pm}$ where $l$ is the index of the base equality and $\pm$ is its sign indicator. We next aggregate all aforementioned weighted constraints. During the aggregation, we require that weights $\beta$ , $\gamma$ , $\theta$ , $\lambda$ and $\mu$ be chosen in such a way that:

(C1)

at least $|\mathcal{L}|+|\bar{\mathcal{L}}|+\sum_{j\in M}|\mathcal{I}_{j}|+|\bar{\mathcal{I}}|+|\mathcal{J}|+|\bar{\mathcal{J}}|$ bilinear terms are canceled (i.e., their coefficient becomes zero), and

(C2)

if $\mathcal{L}\cup\bar{\mathcal{L}}\cup_{j\in M}\mathcal{I}_{j}\cup\bar{\mathcal{I}}\cup\mathcal{J}\cup\bar{\mathcal{J}}\neq\emptyset$ , for each constraint used in the aggregation (including the base equality), at least one bilinear term among all those created after multiplying that constraint with its corresponding weight is canceled.

The desired EC&R inequality is then obtained by relaxing (i.e., replacing) the remaining bilinear terms $x_{i}y_{j}$ in the aggregated inequality using either $x_{i}y_{j}\geq 0$ or $u_{i}y_{j}-x_{i}y_{j}\geq 0$ , depending on the sign of their coefficients. The resulting linear inequality is referred to as a class- $l^{\pm}$ EC&R inequality.

Next, we present a summary of the derivation of the EC&R procedure which will be used in subsequent sections; we refer the reader to [8] for a detailed account. It can be shown that $\bm{y}$ components in the extreme points of $\mathcal{S}$ are binary-valued. As a result, the convex hull of $\mathcal{S}$ can be obtained as a disjunctive union of a finite number of polytopes, each fixing $\bm{y}$ at an extreme point of the simplex $\Delta_{m}$ . A description for this convex hull can be obtained in a higher-dimensional space using the reformulation-linearization technique [18, 19] or disjunctive programming [2] through addition of new variables, as shown below.

[TABLE]

In the above description, variables $\bm{w}^{j}$ and $v^{j}_{k}$ can be viewed as $y_{j}\bm{x}$ and $y_{j}z_{k}$ , respectively, and the equalities are formulated as pairs of inequalities of opposite directions. Because the convex hull description in (1) contains additional variables, we can use polyhedral projection to obtain a convex hull description in the space of original variables.

Define the dual variables associated with the constraints of (1) by $\bm{\alpha}^{j\pm},\bm{\beta}^{\pm}\in{\mathbb{R}}^{\kappa}_{+}$ , $\bm{\gamma}^{j},\bm{\theta}\in{\mathbb{R}}^{\tau}_{+}$ , and $\bm{\eta}^{j},\bm{\rho}^{j},\bm{\lambda},\bm{\mu}\in{\mathbb{R}}^{n}_{+}$ , respectively. It follows from Proposition 2.3 of [8] that the collection of inequalities

[TABLE]

where

[TABLE]

for all extreme rays $\bm{\pi}=\left(\bm{\beta}^{+};\bm{\beta}^{-};\{\bm{\gamma}^{j}\}_{j\in M};\bm{\theta};\{\bm{\eta}^{j}\}_{j\in M};\{\bm{\rho}^{j}\}_{j\in M};\bm{\lambda};\bm{\mu}\right)$ of the projection cone

[TABLE]

contains all non-trivial facet-defining inequalities in the convex hull description of $\mathcal{S}$ . A non-trivial inequality is one that cannot be implied by the linear constraints in the description of $\Xi$ and $\Delta_{m}$ . It is easy to verify that an extreme ray of $\mathcal{C}$ that has components $\beta^{\pm}_{k}=0$ for all $k\in K$ leads to a trivial inequality. Therefore, we may assume that $\beta_{l}^{\pm}=1$ for some $l\in K$ in an extreme ray associated with a non-trivial facet-defining inequality of $\mathop{\rm conv}(\mathcal{S})$ through proper scaling. As a result, the search for the extreme rays of $\mathcal{C}$ reduces to that of the extreme points of the restriction set $\mathcal{C}^{l}$ for all $l\in K$ , where

[TABLE]

where $K_{l}=K\setminus\{l\}$ , and $\bm{\pi}^{l}$ is defined similarly to $\bm{\pi}$ , but without elements $\beta^{+}_{l}$ and $\beta^{-}_{l}$ .

The components in the dual vector $\bm{\pi}^{l}$ can be interpreted as the weights used in the EC&R procedure as follows. The fixing of the component $\beta^{\pm}_{l}$ is achieved in Step 1 of the EC&R procedure by picking a base equality $l$ with a weight of either $+1$ or $-1$ . The components $\bm{\beta}^{\pm}$ represent the weights of the bilinear constraints in $K_{l}$ as described in Step 2 of the EC&R procedure. The components $\bm{\gamma}^{j}$ (resp. $\bm{\theta}$ ) can be viewed as the weights for $y_{j}$ (resp. $1-\sum_{j\in M}y_{j}$ ) when multiplied with the non-bound constraints in $\Xi$ as demonstrated in Step 3 of the EC&R procedure. Similarly, $\bm{\lambda}$ and $\bm{\mu}$ denote the weights for $1-\sum_{j\in M}y_{j}$ when multiplied with the bound constraints in $\Xi$ in Step 4 of the EC&R procedure. Finally, the relaxation step in the EC&R procedure will use the components $\bm{\eta}^{j}$ and $\bm{\rho}^{j}$ as the weights for $y_{j}$ when multiplied with the bound constraints in $\Xi$ to cancel the remaining bilinear terms. It can be shown that criteria (C1) and (C2) of the EC&R procedure provide necessary conditions for the selected weight vector to be an extreme point of $\mathcal{C}^{l}$ . The resulting EC&R inequality is of the form (2), and the collection of all such inequalities contains all non-trivial facet-defining inequalities in $\mathop{\rm conv}(\mathcal{S})$ .

3 Network Polytopes

In this section, we study the set $\mathcal{S}$ where $\Xi$ represents a network polytope. In particular, the constraint set $E\bm{x}\geq\bm{f}$ is composed of the flow-balance constraints after being separated into two inequalities of opposite signs, i.e., $E$ is an augmented node-arc incidence matrix where each row is duplicated with a negative sign, and $\bm{f}$ represents the extended supply/demand vector. In this description, $\bm{u}$ denotes the arc-capacity vector. First, we show that the representation of the EC&R assignment can be simplified for set $\mathcal{S}$ because of the special structure of the bilinear constraints in its description.

Remark 3.1

In $\mathcal{S}$ , each bilinear constraint has a single bilinear term that does not appear in any other bilinear constraints. In particular, for each $k\in K$ , the set contains the bilinear constraint $y_{j}x_{i}-z_{k}=0$ for some $(i,j)\in N\times M$ such that $A^{k}_{ji}=1$ . As a result, we can skip Step 2 in the EC&R procedure and merge it into the relaxation step, in which the bound constraints on variables are multiplied with $y_{j}$ to relax any remaining bilinear term in the aggregated inequality. As such, we may remove the sets $\mathcal{L}$ and $\bar{\mathcal{L}}$ from the EC&R assignment to reformat it as $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ , and perform the aggregation on the constraints in this reduced assignment to satisfy conditions (C1) and (C2), which are adjusted accordingly by dropping $\mathcal{L}$ and $\bar{\mathcal{L}}$ in their statements. In the resulting aggregated inequality, any remaining bilinear term of the form $y_{j}x_{i}$ can be relaxed using either $-y_{j}x_{i}+u_{i}y_{j}\geq 0$ corresponding to the dual weight $\rho^{j}_{i}$ or $-y_{j}x_{i}+z_{k}\geq 0$ with $k\in K$ such that $A^{k}_{ji}=1$ corresponding to the dual weight $\beta^{-}_{k}$ . Similarly, we can relax $-y_{j}x_{i}$ using either $y_{j}x_{i}\geq 0$ corresponding to the dual weight $\eta^{j}_{i}$ or $y_{j}x_{i}-z_{k}\geq 0$ with $k\in K$ such that $A^{k}_{ji}=1$ corresponding to the dual weight $\beta^{+}_{k}$ .

3.1 The Case with $m=1$ .

In this section, we consider the case where $m=1$ , whose corresponding bilinear set is denoted by $\mathcal{S}^{1}$ . In this case, we can simplify notation by matching the indices of $\bm{z}$ and $\bm{x}$ variables such that $y_{1}x_{k}=z_{k}$ for all $k\in K=N$ . We next show that, to generate an EC&R inequality for $\mathcal{S}^{1}$ , it is sufficient to use aggregation weight $1$ for all constraints used in the aggregation.

Proposition 3.2

Let $\bar{\bm{\pi}}^{l}$ be an extreme point of the projection cone $\mathcal{C}^{l}$ , for some $l\in K$ , corresponding to a non-trivial facet-defining inequality of $\mathop{\rm conv}(\mathcal{S}^{1})$ . Then, it can be scaled in such a way that all of its components are 0 or 1.

Proof 3.3

Proof. When $m=1$ , we can write $\mathcal{C}^{l}$ in (3) as

[TABLE]

We can rearrange the columns of the coefficient matrix of the system defining $\mathcal{C}^{l}$ to obtain

[TABLE]

In the above matrix, the rows correspond to bilinear terms $y_{1}x_{i}$ (i.e., $w^{1}_{i}$ in the disjunctive programming formulation (1)) for $i\in N$ . The first and second column blocks correspond to the weights of the non-bound constraints in $\Xi$ multiplied by $y_{1}$ and $1-y_{1}$ , which are denoted by $\gamma^{1}_{t}$ and $\theta_{t}$ , respectively, for all $t\in T$ . The third and fourth column blocks correspond to the weights of the lower and upper bound constraints on variables in $\Xi$ multiplied by $1-y_{1}$ , which are captured by $\lambda_{i}$ and $\mu_{i}$ , respectively, for all $i\in N$ . Similarly, the fifth and sixth column blocks correspond to the weights of the lower and upper bound constraints on variables in $\Xi$ multiplied by $y_{1}$ , which are recorded by $\eta^{1}_{i}$ and $\rho^{1}_{i}$ , respectively, for all $i\in N$ . In these columns, $I$ represents the identity matrix of appropriate size. Lastly, the seventh and eighth column blocks correspond to the weights of the bilinear constraints in $\mathcal{S}^{1}$ , which are represented by $\beta^{+}_{k}$ and $\beta^{-}_{k}$ , respectively, for all $k\in K_{l}$ . In particular, the element at column $k\in K_{l}$ and row $i\in N$ of $\bar{I}$ is equal to $1$ if $i=k$ , and it is equal to zero otherwise. Based on these column values, it can be easily verified that (4) is totally unimodular (TU). In $\mathcal{C}^{l}$ , the right-hand-side vector is $\pm\bm{e}^{l}\in{\mathbb{R}}^{m+n}$ , where $\bm{e}^{l}$ is the unit vector whose components are all zero except for that corresponding to row $l$ representing $y_{1}x_{l}$ , which is equal to $1$ . Because $\bar{\bm{\pi}}^{l}$ is an extreme point of $\mathcal{C}^{l}$ , it is associated with a basic feasible solution for its system of equations. Let $B$ be the corresponding basis for (4). It follows from Cramer’s rule that all elements of $B^{-1}$ belong to $\{0,-1,1\}$ since (4) is TU. 
Therefore, the components of $\pm B^{-1}\bm{e}^{l}$ belong to $\{0,-1,1\}$ . We conclude that the components of basic feasible solutions to $\mathcal{C}^{l}$ are equal to $0$ or $1$ due to non-negativity of all variables in its description. \Halmos
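The total unimodularity argument can be sanity-checked numerically on a toy instance. The following is only a sketch under stated assumptions: the three-node network, the helper names `det` and `is_totally_unimodular`, and the block layout $[N^{\intercal},-N^{\intercal},I,-I]$ (with $N$ a plain node-arc incidence matrix) are hypothetical stand-ins chosen to mimic the types of column blocks described in the proof, not the matrix (4) itself. The check enumerates every square submatrix, so it is practical only for very small examples.

```python
from fractions import Fraction
from itertools import combinations


def det(mat):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in mat]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return sign * result


def is_totally_unimodular(mat):
    """Brute force: every square minor must have determinant in {-1, 0, 1}."""
    rows, cols = len(mat), len(mat[0])
    for k in range(1, min(rows, cols) + 1):
        for rs in combinations(range(rows), k):
            for cs in combinations(range(cols), k):
                sub = [[mat[r][c] for c in cs] for r in rs]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Hypothetical toy network: arcs given as (tail, head) pairs.
arcs = [(0, 1), (1, 2), (0, 2)]
nodes = [0, 1, 2]
# Rows are arcs, columns are nodes: +1 at the tail, -1 at the head.
inc_t = [[1 if t == v else -1 if h == v else 0 for v in nodes] for (t, h) in arcs]
identity = [[1 if i == j else 0 for j in range(len(arcs))] for i in range(len(arcs))]
# Assumed block layout [N^T, -N^T, I, -I].
toy = [inc + [-v for v in inc] + idr + [-v for v in idr]
       for inc, idr in zip(inc_t, identity)]
toy_is_tu = is_totally_unimodular(toy)
```

For contrast, a two-by-two matrix such as `[[1, 1], [-1, 1]]` has determinant 2 and fails the same check.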

Remark 3.4

When $m=1$ , multiplying the bound constraints with $1-y_{1}$ in Step 4 of the EC&R procedure produces two of the standard McCormick bounds. As a result, we can skip Step 4 in the EC&R procedure and merge it into the relaxation step, in which the other two McCormick bounds are used for relaxing the remaining bilinear terms. Considering Remark 3.1, any remaining bilinear term in the aggregated inequality can be relaxed into either of the two McCormick lower bounds or the two McCormick upper bounds or the $\pm z$ variable corresponding to that term depending on its sign. In this case, the characterization of EC&R assignment can be reduced further to $\big{[}\mathcal{I}_{1},\bar{\mathcal{I}}\big{]}$ .

Remark 3.5

As described in Remark 3.4, each remaining bilinear term in the aggregated inequality of the EC&R procedure can be relaxed into three different linear terms. While this can lead to an exponential growth in the number of resulting linear EC&R inequalities for each EC&R assignment, we can use an efficient separation procedure to find the most violated inequality among the resulting EC&R inequalities as follows. Assume that we aim to separate a given solution $(\bar{\bm{x}};\bar{y}_{1};\bar{\bm{z}})$ from $\mathop{\rm conv}(\mathcal{S}^{1})$ through the EC&R inequalities obtained from the aggregated inequality $g(\bar{\bm{x}};y_{1};\bar{\bm{z}})\geq 0$ associated with the EC&R assignment $\big{[}\mathcal{I}_{1},\bar{\mathcal{I}}\big{]}$ . For each bilinear term $y_{1}x_{i}$ , we choose the relaxation option that provides the minimum value among $u_{i}\bar{y}_{1}$ obtained from using $y_{1}(u_{i}-x_{i})\geq 0$ , $\bar{x}_{i}$ obtained from using $(1-y_{1})x_{i}\geq 0$ , and $\bar{z}_{i}$ obtained from using $-y_{1}x_{i}+z_{i}\geq 0$ . Similarly, for each bilinear term $-y_{1}x_{i}$ , we choose the relaxation option that provides the minimum value among $0$ obtained from using $y_{1}x_{i}\geq 0$ , $u_{i}-\bar{x}_{i}-u_{i}\bar{y}_{1}$ obtained from using $(1-y_{1})(u_{i}-x_{i})\geq 0$ , and $-\bar{z}_{i}$ obtained from using $y_{1}x_{i}-z_{i}\geq 0$ . This approach provides the most violated EC&R inequality in time linear in the number of remaining bilinear terms in the aggregated inequality.
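The per-term selection in Remark 3.5 can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function names `tightest_relaxation` and `separate`, the dictionary-based data layout, and the numerical values are assumptions, and all remaining bilinear terms are assumed to carry coefficient $+1$ or $-1$ in line with Proposition 3.2. The argument `linear_value` is assumed to hold the value of the already-linear part of the aggregated inequality at the point being separated.

```python
def tightest_relaxation(sign, x_bar, y_bar, z_bar, u):
    """Return (label, value) of the smallest linear replacement for one term.

    sign = +1 stands for a remaining term +y1*x_i, sign = -1 for -y1*x_i.
    """
    if sign > 0:
        options = {
            "u_i*y1": u * y_bar,   # from y1*(u_i - x_i) >= 0
            "x_i": x_bar,          # from (1 - y1)*x_i >= 0
            "z_i": z_bar,          # from -y1*x_i + z_i >= 0
        }
    else:
        options = {
            "0": 0.0,                                     # from y1*x_i >= 0
            "u_i - x_i - u_i*y1": u - x_bar - u * y_bar,  # from (1 - y1)*(u_i - x_i) >= 0
            "-z_i": -z_bar,                               # from y1*x_i - z_i >= 0
        }
    label = min(options, key=options.get)
    return label, options[label]


def separate(remaining_terms, linear_value, x_bar, y_bar, z_bar, u):
    """Evaluate the most violated EC&R inequality for one aggregated inequality.

    remaining_terms maps an arc to the sign (+1 or -1) of its bilinear term.
    A negative returned value means the chosen inequality (lhs >= 0) is violated.
    """
    lhs = linear_value
    choices = {}
    for arc, sign in remaining_terms.items():
        label, value = tightest_relaxation(sign, x_bar[arc], y_bar, z_bar[arc], u[arc])
        choices[arc] = label
        lhs += value
    return lhs, choices


# Hypothetical data for two remaining terms +y1*x_a and -y1*x_b.
x_bar = {"a": 2.0, "b": 3.0}
z_bar = {"a": 1.5, "b": 0.5}
u = {"a": 4.0, "b": 4.0}
lhs, choices = separate({"a": +1, "b": -1}, -1.0, x_bar, 0.5, z_bar, u)
```

Because each term is handled independently, the loop visits every remaining bilinear term once, which matches the linear running time stated in the remark.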

Considering the relation between the extreme points of the projection cone $\mathcal{C}^{l}$ for $l\in K$ and the aggregation weights in the EC&R procedure, Proposition 3.2 and Remark 3.4 imply that generating class- $l^{\pm}$ EC&R inequalities reduces to identifying the assignment $\big{[}\mathcal{I}_{1},\bar{\mathcal{I}}\big{]}$ as the aggregation weights are readily determined. In particular, the constraints in $\mathcal{I}_{1}$ are multiplied with $y_{1}$ , and those in $\bar{\mathcal{I}}$ are multiplied with $1-y_{1}$ . We next show that, for set $\mathcal{S}^{1}$ , identifying all the EC&R assignments that satisfy the EC&R conditions (C1) and (C2) can be achieved by considering a special graphical structure in the underlying network.

Given a network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ with a node set $\mathrm{V}$ and arc set $\mathrm{A}$ , assume that the index $k$ of variables $z_{k}$ in the description of $\mathcal{S}^{1}$ refers to the arc whose flow variable $x_{k}$ appears in that bilinear constraint, i.e., $y_{1}x_{k}=z_{k}$ for $k\in\mathrm{A}=N=K$ . We define $t(k)$ and $h(k)$ to be the tail and head nodes of arc $k\in\mathrm{A}$ , respectively. Further, for any node $i\in\mathrm{V}$ , we define $\delta^{+}(i)$ and $\delta^{-}(i)$ to be the set of outgoing and incoming arcs at that node, respectively. We refer to the flow-balance inequality $\sum_{k\in\delta^{+}(i)}x_{k}-\sum_{k\in\delta^{-}(i)}x_{k}\geq f_{i}$ (resp. $-\sum_{k\in\delta^{+}(i)}x_{k}+\sum_{k\in\delta^{-}(i)}x_{k}\geq-f_{i}$ ) corresponding to node $i$ as the positive (resp. negative) flow-balance inequality, and refer to its index in the description of $\Xi$ by $i^{+}$ (resp. $i^{-}$ ) to be recorded in the EC&R assignment. For example, an EC&R assignment $\big{[}\{i^{+}\},\{j^{-}\}\big{]}$ implies that, in the aggregation, the positive flow-balance inequality corresponding to the node $i\in\mathrm{V}$ is multiplied with $y_{1}$ , and the negative flow-balance inequality corresponding to the node $j\in\mathrm{V}$ is multiplied with $1-y_{1}$ . In the sequel, we denote the undirected variant of a subnetwork $\mathrm{P}$ of $\mathrm{G}$ by $\bar{\mathrm{P}}$ , and conversely, we denote the directed variant of an undirected subnetwork $\bar{\mathrm{P}}$ of $\bar{\mathrm{G}}$ by $\mathrm{P}$ , according to the arc directions in $\mathrm{G}$ .

Proposition 3.6

Consider set $\mathcal{S}^{1}$ with $\Xi$ that represents the network polytope corresponding to the network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ . Let $\big{[}\mathcal{I}_{1},\bar{\mathcal{I}}\big{]}$ be an EC&R assignment for class- $l^{\pm}$ , for some $l\in\mathrm{A}$ , that leads to a non-trivial facet-defining inequality of $\mathop{\rm conv}(\mathcal{S}^{1})$ . Define $\widetilde{\mathcal{I}}=\{i\in\mathrm{V}|i^{\pm}\in\mathcal{I}_{1}\cup\bar{\mathcal{I}}\}$ to be the subset of nodes whose flow-balance inequalities are used in the aggregation. Then, there exists a tree $\bar{\mathrm{T}}$ of $\bar{\mathrm{G}}$ composed of the nodes in $\widetilde{\mathcal{I}}$ such that arc $l$ is incident to exactly one node of $\bar{\mathrm{T}}$ .

Proof 3.7

Proof. First, we observe that for each node $i\in\mathrm{V}$ , both of its positive and negative flow-balance inequalities cannot be selected for the aggregation, since otherwise, the columns representing the positive and negative inequalities in the basis of the coefficient matrix (4) associated with the extreme point of $\mathcal{C}^{l}$ would be linearly dependent, which would be a contradiction to the fact that the selected EC&R assignment leads to a facet-defining inequality of $\mathop{\rm conv}(\mathcal{S}^{1})$ ; see the proof of Proposition 3.2 for details. As a result, considering that $\mathcal{I}_{1}\cap\bar{\mathcal{I}}=\emptyset$ by the EC&R requirement, at most one of the following possibilities can occur in the EC&R assignment: $i^{+}\in\mathcal{I}_{1}$ , $i^{+}\in\bar{\mathcal{I}}$ , $i^{-}\in\mathcal{I}_{1}$ , and $i^{-}\in\bar{\mathcal{I}}$ . Therefore, each node in $\widetilde{\mathcal{I}}$ corresponds to a unique flow-balance constraint in the EC&R assignment. Next, we show that arc $l$ is incident to exactly one node of $\widetilde{\mathcal{I}}$ . It follows from condition (C2) of the EC&R procedure that the bilinear term $y_{1}x_{l}$ for arc $l$ in the base equality must be canceled during the aggregation. The constraints of $\Xi$ that can produce the bilinear term $y_{1}x_{l}$ during the aggregation are the flow-balance constraint corresponding to the tail node $t(l)$ of arc $l$ , and the flow-balance constraint corresponding to the head node $h(l)$ of arc $l$ . Since the aggregation weights for all the constraints in the EC&R assignment are $1$ according to Proposition 3.2, and considering that each flow-balance constraint can appear once in the aggregation as noted above, the only possibility to cancel the term $y_{1}x_{l}$ is to pick exactly one of the above constraints in the EC&R assignment. As a result, exactly one of the head and the tail nodes of arc $l$ must be in $\widetilde{\mathcal{I}}$ . 
Next, we show that there exists a tree $\bar{\mathrm{T}}$ of $\bar{\mathrm{G}}$ whose node set is $\widetilde{\mathcal{I}}$ . Assume by contradiction that there is no such tree composed of the nodes in $\widetilde{\mathcal{I}}$ . Therefore, $\widetilde{\mathcal{I}}$ can be partitioned into two subsets $\widetilde{\mathcal{I}}_{1}$ and $\widetilde{\mathcal{I}}_{2}$ , where the nodes in $\widetilde{\mathcal{I}}_{1}$ are not adjacent to any nodes in $\widetilde{\mathcal{I}}_{2}$ . It is clear that arc $l$ cannot be incident to the nodes in both $\widetilde{\mathcal{I}}_{1}$ and $\widetilde{\mathcal{I}}_{2}$ , since otherwise $\widetilde{\mathcal{I}}_{1}$ and $\widetilde{\mathcal{I}}_{2}$ would have adjacent nodes. Assume without loss of generality that arc $l$ is incident to a node in $\widetilde{\mathcal{I}}_{1}$ . Since the given EC&R assignment leads to a facet-defining inequality after applying the relaxation step, its aggregation weights correspond to an extreme point of $\mathcal{C}^{l}$ as described in the proof of Proposition 3.2. The resulting system of equations for the associated basic feasible solution can be written as

[TABLE]

where the columns and rows of the basis matrix have been suitably reordered. In (5), the first (resp. second) row block corresponds to bilinear terms $y_{1}x_{i}$ for arcs $i\in\mathrm{A}$ that are incident to the nodes in $\widetilde{\mathcal{I}}_{1}$ (resp. $\widetilde{\mathcal{I}}_{2}$ ), and the last row block corresponds to all the other bilinear terms that do not appear during aggregation. The first (resp. fourth) column block denotes the transpose of the node-arc incidence matrix for nodes in $\widetilde{\mathcal{I}}_{1}$ (resp. $\widetilde{\mathcal{I}}_{2}$ ). The second (resp. fifth) column block contains positive or negative multiples of columns of the identity matrix representing the weights used in the relaxation step of the EC&R procedure corresponding to the McCormick bounds. The third (resp. sixth) column block represents positive or negative multiples of the bilinear constraints in the description of $\mathcal{S}^{1}$ used in the relaxation step corresponding to the arcs appearing in the first (resp. second) row blocks. All these columns have weights equal to 1 according to Proposition 3.2 as denoted in the first two row blocks of the solution vector multiplied with this matrix. The last column in the basis corresponds to the constraints that have weights $0$ in the basic feasible solution and are added to complete the basis. Lastly, $\bm{e}^{l}$ is a unit vector whose elements are all zeros except that corresponding to $y_{1}x_{l}$ , which is equal to $1$ . It is now easy to verify that the linear combination of the columns in the column blocks 4, 5 and 6 of the basis matrix with weights $1$ yields the zero vector. This shows that these columns are linearly dependent, a contradiction. \Halmos
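The two structural conditions in Proposition 3.6 are easy to test for a candidate node set. The sketch below is an assumption-laden illustration: the function name `supports_tree`, the representation of arcs as `(tail, head)` pairs, and the small network are all hypothetical. It relies on the elementary fact that a tree of $\bar{\mathrm{G}}$ with node set $\widetilde{\mathcal{I}}$ exists exactly when the nodes of $\widetilde{\mathcal{I}}$ are connected using only edges between them.

```python
from collections import deque


def supports_tree(nodes, arcs, l):
    """Check that `nodes` can form a tree and that arc `l` touches exactly one of them."""
    nodes = set(nodes)
    if not nodes:
        return False  # the empty tree is treated separately (McCormick bounds)
    tail, head = l
    if (tail in nodes) == (head in nodes):
        return False  # arc l is incident to zero or two nodes of the set
    # Undirected adjacency restricted to the candidate node set.
    adj = {v: set() for v in nodes}
    for t, h in arcs:
        if t in nodes and h in nodes:
            adj[t].add(h)
            adj[h].add(t)
    # Breadth-first search from an arbitrary node of the set.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == nodes


# Hypothetical network with arcs written as (tail, head).
arcs = [(1, 2), (2, 3), (3, 4), (1, 5)]
ok = supports_tree({1, 2, 3}, arcs, (1, 5))
disconnected = supports_tree({1, 3}, arcs, (1, 5))
both_ends = supports_tree({1, 5}, arcs, (1, 5))
```

A node set that passes this check is only a candidate: the proposition gives necessary conditions for a facet-defining EC&R assignment, not sufficient ones.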

Proposition 3.6 implies that each non-trivial EC&R inequality can be obtained as an aggregation of constraints corresponding to a special tree structure. The next theorem provides the converse result that aggregating constraints associated with each special tree structure can produce EC&R inequalities.

Theorem 3.8

Consider set $\mathcal{S}^{1}$ with $\Xi$ that represents the network polytope corresponding to the network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ . Let $\bar{\mathrm{T}}$ be a tree in $\bar{\mathrm{G}}$ with the node set $\widetilde{\mathcal{I}}\subseteq\mathrm{V}$ , and let $l\in\mathrm{A}$ be an arc that is incident to exactly one node of $\bar{\mathrm{T}}$ . Then, for any partition $\widetilde{\mathcal{I}}_{1}$ and $\widetilde{\mathcal{I}}_{2}$ of $\widetilde{\mathcal{I}}$ (i.e., $\widetilde{\mathcal{I}}_{1}\cap\widetilde{\mathcal{I}}_{2}=\emptyset$ and $\widetilde{\mathcal{I}}_{1}\cup\widetilde{\mathcal{I}}_{2}=\widetilde{\mathcal{I}}$ ), we have that

(i)

if $h(l)\in\widetilde{\mathcal{I}}$ , then $\big{[}\{i^{+}\}_{i\in\widetilde{\mathcal{I}}_{1}},\{i^{-}\}_{i\in\widetilde{\mathcal{I}}_{2}}\big{]}$ is an EC&R assignment for class- $l^{+}$ ,

(ii)

if $h(l)\in\widetilde{\mathcal{I}}$ , then $\big{[}\{i^{-}\}_{i\in\widetilde{\mathcal{I}}_{1}},\{i^{+}\}_{i\in\widetilde{\mathcal{I}}_{2}}\big{]}$ is an EC&R assignment for class- $l^{-}$ ,

(iii)

if $t(l)\in\widetilde{\mathcal{I}}$ , then $\big{[}\{i^{-}\}_{i\in\widetilde{\mathcal{I}}_{1}},\{i^{+}\}_{i\in\widetilde{\mathcal{I}}_{2}}\big{]}$ is an EC&R assignment for class- $l^{+}$ ,

(iv)

if $t(l)\in\widetilde{\mathcal{I}}$ , then $\big{[}\{i^{+}\}_{i\in\widetilde{\mathcal{I}}_{1}},\{i^{-}\}_{i\in\widetilde{\mathcal{I}}_{2}}\big{]}$ is an EC&R assignment for class- $l^{-}$ .

Proof 3.9

Proof. We show the result for part (i), as the proof for parts (ii)–(iv) follows from similar arguments. It suffices to show that the aggregation procedure performed on the constraints in the proposed assignment satisfies the EC&R conditions (C1) and (C2). For condition (C1), we need to show that at least $|\widetilde{\mathcal{I}}_{1}|+|\widetilde{\mathcal{I}}_{2}|$ bilinear terms are canceled during aggregation. Let $R$ be the set of arcs in $\mathrm{T}$ , which is the directed variant of $\bar{\mathrm{T}}$ obtained by replacing each edge with its corresponding arc in $\mathrm{G}$ . It is clear from the definition that $l\notin R$ . As a result, for each $r\in R$ , the only constraints in the aggregation that contain $x_{r}$ are the flow-balance constraints corresponding to the head node $h(r)$ and tail node $t(r)$ of $r$ since both of these nodes are included in $\widetilde{\mathcal{I}}$ as $r$ is an arc in $\mathrm{T}$ . There are four cases for the partitions of $\widetilde{\mathcal{I}}$ that these head and tail nodes can belong to. For the first case, assume that $h(r)\in\widetilde{\mathcal{I}}_{1}$ and $t(r)\in\widetilde{\mathcal{I}}_{1}$ . It follows from the EC&R assignment in case (i) that the positive flow-balance constraints $h(r)^{+}$ and $t(r)^{+}$ are used in the aggregation with weights $y_{1}$ . In particular, we have $y_{1}\big{(}\sum_{k\in\delta^{+}(h(r))}x_{k}-\sum_{k\in\delta^{-}(h(r))\setminus\{r\}}x_{k}-x_{r}\geq f_{(h(r))}\big{)}$ added with $y_{1}\big{(}\sum_{k\in\delta^{+}(t(r))\setminus\{r\}}x_{k}-\sum_{k\in\delta^{-}(t(r))}x_{k}+x_{r}\geq f_{(t(r))}\big{)}$ , which results in the cancellation of $y_{1}x_{r}$ . For the second case, assume that $h(r)\in\widetilde{\mathcal{I}}_{1}$ and $t(r)\in\widetilde{\mathcal{I}}_{2}$ . 
It follows from the EC&R assignment that the positive flow-balance constraint $h(r)^{+}$ and the negative flow-balance constraint $t(r)^{-}$ are used in the aggregation with weights $y_{1}$ and $(1-y_{1})$ , respectively. In particular, we have $y_{1}\big{(}\sum_{k\in\delta^{+}(h(r))}x_{k}-\sum_{k\in\delta^{-}(h(r))\setminus\{r\}}x_{k}-x_{r}\geq f_{(h(r))}\big{)}$ added with $(1-y_{1})\big{(}-\sum_{k\in\delta^{+}(t(r))\setminus\{r\}}x_{k}+\sum_{k\in\delta^{-}(t(r))}x_{k}-x_{r}\geq-f_{(t(r))}\big{)}$ , which results in the cancellation of $y_{1}x_{r}$ . For the remaining two cases, we can use similar arguments by changing the inequality signs to show that the term $y_{1}x_{r}$ will be canceled during aggregation. As a result, we obtain at least $|R|$ cancellations during the aggregation corresponding to the arcs of $\mathrm{T}$ . Since $\bar{\mathrm{T}}$ is a tree, we have that $|R|=|\widetilde{\mathcal{I}}_{1}|+|\widetilde{\mathcal{I}}_{2}|-1$ . Finally, for arc $l$ , it follows from the assumption of case (i) in the theorem statement that $h(l)\in\widetilde{\mathcal{I}}$ and $t(l)\notin\widetilde{\mathcal{I}}$ . If $h(l)\in\widetilde{\mathcal{I}}_{1}$ , then according to the EC&R procedure for class- $l^{+}$ , we aggregate $y_{1}x_{l}-z_{l}=0$ with $y_{1}\big{(}\sum_{k\in\delta^{+}(h(l))}x_{k}-\sum_{k\in\delta^{-}(h(l))\setminus\{l\}}x_{k}-x_{l}\geq f_{(h(l))}\big{)}$ , which results in the cancellation of $y_{1}x_{l}$ . If $h(l)\in\widetilde{\mathcal{I}}_{2}$ , we aggregate $y_{1}x_{l}-z_{l}=0$ with $(1-y_{1})\big{(}-\sum_{k\in\delta^{+}(h(l))}x_{k}+\sum_{k\in\delta^{-}(h(l))\setminus\{l\}}x_{k}+x_{l}\geq-f_{(h(l))}\big{)}$ , which results in the cancellation of $y_{1}x_{l}$ . As a result, in total we have at least $|R|+1=|\widetilde{\mathcal{I}}_{1}|+|\widetilde{\mathcal{I}}_{2}|$ cancellations during the aggregation of the constraints in the EC&R assignment, showing the satisfaction of condition (C1) of the EC&R procedure. 
For condition (C2) of the EC&R procedure, we need to show that for each constraint used in the aggregation, including the base equality, at least one bilinear term among those created after multiplication of that constraint with its corresponding weight is canceled. There are two types of constraints to consider. The first type is the flow-balance constraints in $\widetilde{\mathcal{I}}_{1}$ and $\widetilde{\mathcal{I}}_{2}$ , which correspond to the nodes of $\bar{\mathrm{T}}$ . It follows from the previous discussion that for each node $i\in\widetilde{\mathcal{I}}_{1}\cup\widetilde{\mathcal{I}}_{2}$ , the bilinear term $y_{1}x_{r}$ , where $r$ is an arc in $\mathrm{T}$ that is incident to $i$ , i.e., $h(r)=i$ or $t(r)=i$ , is canceled during aggregation. This proves that at least one bilinear term is canceled in the inequality obtained after multiplying the corresponding flow-balance constraint at node $i$ with $y_{1}$ or $1-y_{1}$ . The second type of constraints used in the aggregation is the base equality $l$ . The proof follows from an argument similar to that given above where we showed that the bilinear term $y_{1}x_{l}$ that appears in the base constraint $y_{1}x_{l}-z_{l}=0$ is canceled. We conclude that condition (C2) of the EC&R procedure is satisfied for all constraints used in the aggregation. \Halmos
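The four cases of Theorem 3.8 and the cancellation count used in the proof can be reproduced mechanically. The code below is a minimal sketch under stated assumptions: the names `ecr_assignment` and `aggregate`, the encoding of $i^{+}$ and $i^{-}$ as pairs `(i, "+")` and `(i, "-")`, and the path-like network are hypothetical, and all aggregation weights are taken equal to $1$ as in Proposition 3.2. Only the coefficients of the bilinear terms $y_{1}x_{k}$ are tracked, which is all that condition (C1) needs.

```python
def ecr_assignment(part1, part2, l, sign):
    """Return (I1, Ibar) following cases (i)-(iv); sign is '+' or '-'."""
    part1, part2 = set(part1), set(part2)
    if part1 & part2:
        raise ValueError("the two parts must be disjoint")
    tail, head = l
    tree_nodes = part1 | part2
    if (tail in tree_nodes) == (head in tree_nodes):
        raise ValueError("arc l must be incident to exactly one tree node")
    head_in = head in tree_nodes
    # Cases (i) and (iv) put positive inequalities in I1; (ii) and (iii) negative ones.
    first = "+" if head_in == (sign == "+") else "-"
    second = "-" if first == "+" else "+"
    return {(i, first) for i in part1}, {(i, second) for i in part2}


def aggregate(arcs, I1, Ibar, l, sign):
    """Coefficients of y1*x_k after aggregation; returns (canceled, remaining)."""
    coef = {a: 0 for a in arcs}
    touched = {l}
    coef[l] += 1 if sign == "+" else -1  # base equality y1*x_l - z_l = 0, weight +1 or -1
    # Multiplying by y1 contributes +y1*(...); multiplying by (1 - y1) contributes -y1*(...).
    weighted = [(c, 1) for c in I1] + [(c, -1) for c in Ibar]
    for (node, s), mult in weighted:
        direction = 1 if s == "+" else -1
        for arc in arcs:
            t, h = arc
            if t == node:  # outgoing arc: +x in the positive inequality
                coef[arc] += mult * direction
                touched.add(arc)
            if h == node:  # incoming arc: -x in the positive inequality
                coef[arc] -= mult * direction
                touched.add(arc)
    canceled = sorted(a for a in touched if coef[a] == 0)
    remaining = {a: c for a, c in coef.items() if c != 0}
    return canceled, remaining


# Hypothetical network; the tree has nodes {1, 2, 3} and l = (1, 5) has its tail in the tree.
arcs = [(1, 2), (2, 3), (3, 4), (1, 5)]
I1, Ibar = ecr_assignment({2}, {1, 3}, (1, 5), "+")      # case (iii)
canceled, remaining = aggregate(arcs, I1, Ibar, (1, 5), "+")
```

In this toy instance three bilinear terms cancel, matching $|\mathcal{I}_{1}|+|\bar{\mathcal{I}}|=3$ as required by (C1), and the single remaining term $-y_{1}x_{3,4}$ would then be handled by the relaxation step.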

In view of Theorem 3.8, note that for the most basic choice of the tree $\bar{\mathrm{T}}$ , i.e., an empty set, the resulting EC&R inequalities recover the classical McCormick bounds. Therefore, considering any nonempty tree structure can potentially improve the McCormick results by adding new valid inequalities for the bilinear set.
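To make the smallest nonempty case concrete, consider a purely hypothetical node $1$ with a single outgoing arc $a$ (so $t(a)=1$ ) and a single incoming arc $b$ (so $h(b)=1$ ); this toy instance is an assumption for illustration and is not taken from the paper. Take $l=a$ , the single-node tree with $\widetilde{\mathcal{I}}=\{1\}$ , and the partition $\widetilde{\mathcal{I}}_{1}=\{1\}$ , $\widetilde{\mathcal{I}}_{2}=\emptyset$ . Case (iii) of Theorem 3.8 then gives the assignment $\big{[}\{1^{-}\},\emptyset\big{]}$ for class- $a^{+}$ , and the aggregation would proceed as follows.

```latex
\begin{align*}
&\text{negative flow-balance inequality at node } 1: && -x_{a} + x_{b} \geq -f_{1},\\
&\text{multiplied by } y_{1}: && -y_{1}x_{a} + y_{1}x_{b} \geq -f_{1}y_{1},\\
&\text{added to the base equality } y_{1}x_{a} - z_{a} = 0: && y_{1}x_{b} - z_{a} + f_{1}y_{1} \geq 0.
\end{align*}
```

The term $y_{1}x_{a}$ is canceled, so one cancellation is obtained for one constraint in the assignment, and both the base equality and the flow-balance inequality lose a bilinear term, in line with (C1) and (C2). Relaxing the remaining term $y_{1}x_{b}$ with each of the three options in Remark 3.4 gives

```latex
\begin{align*}
u_{b}y_{1} - z_{a} + f_{1}y_{1} &\geq 0, &
x_{b} - z_{a} + f_{1}y_{1} &\geq 0, &
z_{b} - z_{a} + f_{1}y_{1} &\geq 0.
\end{align*}
```

As a quick check, the last inequality holds with equality on the bilinear set, since $z_{b}-z_{a}+f_{1}y_{1}=y_{1}(x_{b}-x_{a}+f_{1})$ and the two flow-balance inequalities at node $1$ together force $x_{a}-x_{b}=f_{1}$ .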

Proposition 3.6 and Theorem 3.8 suggest that the EC&R assignments have a simple graphical interpretation for $\mathcal{S}^{1}$ , which can be used to generate all non-trivial EC&R inequalities to describe $\mathop{\rm conv}(\mathcal{S}^{1})$ without the need to search for all possible constraints and their aggregation weights that satisfy the EC&R conditions as is common for general bilinear sets. This attribute can significantly reduce cut-generation efforts when used systematically to produce cutting planes. Such a systematic procedure can be designed by identifying trees of a given network and then following the result of Theorem 3.8 to obtain the corresponding EC&R assignments. We illustrate this method in the following example.

Example 3.10

Consider set $\mathcal{S}^{1}$ where $\Xi$ represents the network model corresponding to a spiked cycle graph $\mathrm{G}=(\mathrm{V},\mathrm{A})$ shown in Figure 1. We refer to each arc in this network as a pair $(i,j)$ of its tail node $i$ and its head node $j$ , and denote its corresponding flow variable as $x_{i,j}$ . Assume that we are interested in finding EC&R assignments for class- $(1,5)^{+}$ . According to Theorem 3.8, we need to identify the trees that contain exactly one of the tail and head nodes of arc $(1,5)$ . For instance, we may select the tree $\bar{\mathrm{T}}$ composed of the nodes $\widetilde{\mathcal{I}}=\{8,4,1,2,6\}$ , which contains the tail node of arc $(1,5)$ . Consider the partitions $\widetilde{\mathcal{I}}_{1}=\{8,2\}$ and $\widetilde{\mathcal{I}}_{2}=\{4,1,6\}$ . Following case (iii) in Theorem 3.8, we can obtain the EC&R assignment $\big{[}\{8^{-},2^{-}\},\{4^{+},1^{+},6^{+}\}\big{]}$ for class- $(1,5)^{+}$ . As a result, we multiply the negative flow-balance constraints at nodes $8$ and $2$ with $y_{1}$ , and we multiply the positive flow-balance constraints at nodes $4$ , $1$ , and $6$ with $1-y_{1}$ , and aggregate them with the base bilinear equality corresponding to arc $(1,5)$ with weight 1 to obtain the aggregated inequality

[TABLE]

where $f_{i}$ denotes the supply/demand value at node $i$ . Following Remark 3.4, we may relax each remaining bilinear term $-y_{1}x_{2,3}$ and $-y_{1}x_{4,3}$ into three possible linear expressions, leading to 9 total EC&R inequalities. If implemented inside of a separation oracle, we can use Remark 3.5 to find the most violated inequality among these 9 efficiently.
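The structural part of this example can be checked mechanically. The following is a minimal sketch, assuming that the arc set of the network in Figure 1 is the one implied by the flow-balance constraints quoted in Example 3.13 (the figure itself is not reproduced here), and using the fact noted in Remark 3.21 that the remaining bilinear terms correspond to arcs incident to exactly one node of the tree. The names `ARCS`, `is_connected`, and `boundary_arcs` are illustrative only.

```python
# Arc list inferred from the flow-balance constraints quoted in Example 3.13.
# This is an assumption about Figure 1, not a reproduction of it.
ARCS = [(8, 4), (4, 1), (4, 3), (3, 7), (2, 3), (6, 2), (2, 1), (1, 5)]


def is_connected(nodes, arcs):
    """Return True if `nodes` induce a connected subgraph (arc direction ignored)."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for i, j in arcs:
            if i == u and j in nodes and j not in seen:
                seen.add(j)
                stack.append(j)
            elif j == u and i in nodes and i not in seen:
                seen.add(i)
                stack.append(i)
    return seen == nodes


def boundary_arcs(nodes, arcs):
    """Return the arcs that have exactly one endpoint in `nodes`."""
    nodes = set(nodes)
    return [a for a in arcs if (a[0] in nodes) != (a[1] in nodes)]


tree_nodes = {8, 4, 1, 2, 6}   # node set of the selected tree
base_arc = (1, 5)              # arc of the base bilinear equality

# The node set must support a tree, i.e., induce a connected subgraph.
spans_tree = is_connected(tree_nodes, ARCS)
# Theorem 3.8 asks for exactly one endpoint of the base arc in the tree.
one_endpoint = (base_arc[0] in tree_nodes) != (base_arc[1] in tree_nodes)
# Arcs other than the base arc that leave the tree carry the remaining terms.
remaining = [a for a in boundary_arcs(tree_nodes, ARCS) if a != base_arc]
# Three relaxation choices per remaining term (Remark 3.4).
num_inequalities = 3 ** len(remaining)

print(spans_tree, one_endpoint, remaining, num_inequalities)
```

Under the assumed arc set, the two arcs found by `boundary_arcs` besides the base arc are $(4,3)$ and $(2,3)$, which agrees with the remaining terms $-y_{1}x_{4,3}$ and $-y_{1}x_{2,3}$ and with the count of 9 inequalities stated above.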

Consider a generalization of $\mathcal{S}^{1}$ where the bilinear constraints may contain multiple bilinear terms:

[TABLE]

where $A^{k}$ is a matrix of appropriate size with potentially multiple nonzero elements. For instance, $\widetilde{\mathcal{S}}^{1}$ may include the constraint $2y_{1}x_{i}-5y_{1}x_{j}=z_{k}$ for some $i,j\in N$ . In this case, the coefficient matrix (4) of $\mathcal{C}^{l}$ will be modified as follows after rearranging columns and rows.

[TABLE]

In the above matrix, the row and column blocks are defined similarly to those of (4) with a difference that the seventh and eighth column blocks correspond to the weights of the bilinear constraints $y_{1}A^{k}\bm{x}=z_{k}$ , which are represented by $\beta^{+}_{k}$ and $\beta^{-}_{k}$ in the dual weight vector, respectively, for all $k\in K_{l}$ . In particular, the element at column $k\in K_{l}$ and row $i\in N$ of $\tilde{A}$ is equal to $A^{k}_{1i}$ . It is clear from the structure of (6) that this matrix may lose the TU property when $A^{k}$ contains multiple nonzero elements for some $k\in K_{l}$ . In fact, this property may not hold even if $A^{k}_{1i}\in\{0,1,-1\}$ for all $k\in K_{l}$ and $i\in N$ . As a result, there is no guarantee that the aggregation weights for all the EC&R inequalities corresponding to non-trivial facets of $\mathop{\rm conv}(\widetilde{\mathcal{S}}^{1})$ will be $1$ . While an explicit derivation of the convex hull description through identifying special network structures, such as those presented for $\mathcal{S}^{1}$ , may not be attainable for this problem in its original space of variables, we can use the following ancillary result to apply Theorem 3.8 and obtain a convex hull description for $\widetilde{\mathcal{S}}^{1}$ in a higher-dimensional space.

Proposition 3.11

Consider sets

[TABLE]

and

[TABLE]

Then,

[TABLE]

Proof 3.12

Proof. We prove the result by showing both directions of inclusion for the equality. The direct inclusion follows from the fact that the convex hull of intersection of two sets is a subset of the intersection of the convex hulls of those sets. For the reverse inclusion, we need to show that $\mathop{\rm conv}\left((\mathcal{S}^{1}\times{\mathbb{R}}^{\kappa})\cap\mathcal{D}\right)\supseteq\left(\mathop{\rm conv}(\mathcal{S}^{1})\times{\mathbb{R}}^{\kappa}\right)\cap\mathcal{D}$ . Consider a point $\bar{\phi}=(\bar{\bm{x}};\bar{y};\bar{\bm{w}};\bar{\bm{z}})\in\left(\mathop{\rm conv}\left(\mathcal{S}^{1}\right)\times{\mathbb{R}}^{\kappa}\right)\cap\mathcal{D}$ . We show that $\bar{\phi}\in\mathop{\rm conv}\left((\mathcal{S}^{1}\times{\mathbb{R}}^{\kappa})\cap\mathcal{D}\right)$ . It follows from the assumption that $\bar{z}_{k}=A^{k}\bar{\bm{w}}$ for all $k\in K$ . Further, there must exist a finite collection of points $\hat{\phi}^{j}=(\hat{\bm{x}}^{j};\hat{y}^{j};\hat{\bm{w}}^{j})\in\mathcal{S}^{1}$ for $j\in J$ such that $\bar{\bm{x}}=\sum_{j\in J}\omega_{j}\hat{\bm{x}}^{j}$ , $\bar{y}=\sum_{j\in J}\omega_{j}\hat{y}^{j}$ , and $\bar{\bm{w}}=\sum_{j\in J}\omega_{j}\hat{\bm{w}}^{j}$ for some non-negative weights $\omega_{j}$ such that $\sum_{j\in J}\omega_{j}=1$ . Consider the set of points $\dot{\phi}^{j}=(\dot{\bm{x}}^{j};\dot{y}^{j};\dot{\bm{w}}^{j};\dot{\bm{z}}^{j})$ for $j\in J$ such that $\dot{\bm{x}}^{j}=\hat{\bm{x}}^{j}$ , $\dot{y}^{j}=\hat{y}^{j}$ , $\dot{\bm{w}}^{j}=\hat{\bm{w}}^{j}$ , and $\dot{\bm{z}}^{j}_{k}=A^{k}\dot{\bm{w}}^{j}$ for all $k\in K$ . It is clear that $\dot{\phi}^{j}\in(\mathcal{S}^{1}\times{\mathbb{R}}^{\kappa})\cap\mathcal{D}$ for all $j\in J$ by definition of the components of these points. 
It follows that $\bar{\phi}=\sum_{j\in J}\omega_{j}\dot{\phi}^{j}$ , since $\bar{\bm{x}}=\sum_{j\in J}\omega_{j}\dot{\bm{x}}^{j}$ , $\bar{y}=\sum_{j\in J}\omega_{j}\dot{y}^{j}$ , and $\bar{\bm{w}}=\sum_{j\in J}\omega_{j}\dot{\bm{w}}^{j}$ by definition, and since $\bar{z}_{k}=A^{k}\bar{\bm{w}}=A^{k}(\sum_{j\in J}\omega_{j}\hat{\bm{w}}^{j})=\sum_{j\in J}\omega_{j}A^{k}\hat{\bm{w}}^{j}=\sum_{j\in J}\omega_{j}A^{k}\dot{\bm{w}}^{j}=\sum_{j\in J}\omega_{j}\dot{\bm{z}}^{j}_{k}$ for all $k\in K$ . This proves that $\bar{\phi}\in\mathop{\rm conv}\left((\mathcal{S}^{1}\times{\mathbb{R}}^{\kappa})\cap\mathcal{D}\right)$ . $\square$

The result of Proposition 3.11 shows that we can obtain a convex hull description for $\widetilde{\mathcal{S}}^{1}$ in a higher dimension, which is expressed on the left-hand-side of (7), by finding the convex hull of $\mathcal{S}^{1}$ through application of Theorem 3.8 and then intersecting it with the linear constraints in $\mathcal{D}$ as indicated on the right-hand-side of (7).

3.2 The Case with $m>1$ .

In this section, we consider the general case where $m>1$ in $\mathcal{S}$ . The coefficient matrix of $\mathcal{C}^{l}$ in (3) can be written as follows after a suitable rearrangement of columns and rows.

[TABLE]

In the above matrix, each row block $j\in M$ represents the bilinear terms $y_{j}x_{i}$ (i.e., $w^{j}_{i}$ in the disjunctive programming formulation (1)) for all $i\in N$ . The first $m$ column blocks correspond to the weights of the flow-balance constraints in $\Xi$ multiplied by $y_{j}$ for $j\in M$ , which are denoted by $\gamma^{j}_{t}$ for all $t\in T$ in the dual vector. The next column block represents the weights of the flow-balance constraints in $\Xi$ multiplied by $1-\sum_{j\in M}y_{j}$ , which are denoted by $\theta_{t}$ for all $t\in T$ in the dual vector. The next two column blocks indicate the lower and upper bound constraints on variables in $\Xi$ multiplied by $1-\sum_{j\in M}y_{j}$ , which are recorded by $\lambda_{i}$ and $\mu_{i}$ , respectively, for all $i\in N$ . The next $2m$ column blocks correspond to the weights of the lower and upper bound constraints on variables in $\Xi$ multiplied by $y_{j}$ for $j\in M$ , which are recorded by $\eta^{j}_{i}$ and $\rho^{j}_{i}$ , respectively, for all $i\in N$ . The last $2m$ column blocks correspond to the weights of the positive and negative bilinear constraints in $\mathcal{S}$ , which are represented by $\beta^{+}_{k}$ and $\beta^{-}_{k}$ for all $k\in K_{l}$ . For instance, for constraint $y_{j}x_{i}-z_{k}\geq 0$ with $(i,j)\in N\times M$ and $k\in K_{l}$ , the elements of column $k$ in $\bar{I}^{j}$ are all zero except the one in the row corresponding to the bilinear term $y_{j}x_{i}$ which is equal to one.

It is clear from the structure of (12) that this matrix does not have the TU property, implying that a result similar to that of Proposition 3.2 does not necessarily hold. Therefore, there is no guarantee that the aggregation weights for all the EC&R assignments obtained from the extreme points of $\mathcal{C}^{l}$ are $1$ . In fact, Example 3.13 shows that there exist EC&R assignments with aggregation weights that map to extreme points of $\mathcal{C}^{l}$ with components that are not all $0$ or $1$ .

Example 3.13

Consider set $\mathcal{S}$ where $\Xi$ describes the network polytope corresponding to network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ in Figure 1, and $\Delta=\{(y_{1},y_{2})\in{\mathbb{R}}^{2}_{+}|y_{1}+y_{2}\leq 1\}$ . Select class- $l^{+}$ corresponding to the base equality $y_{1}x_{8,4}-z_{l}=0$ . Let $\big{[}\mathcal{I}_{1},\mathcal{I}_{2},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ be an assignment for class- $l^{+}$ where $\mathcal{I}_{1}=\{4^{+},3^{+}\}$ , $\mathcal{I}_{2}=\{2^{-}\}$ , $\bar{\mathcal{I}}=\{1^{+}\}$ , $\mathcal{J}=\{(4,1)\}$ , and $\bar{\mathcal{J}}=\{(2,3)\}$ . Next, we show that the above assignment is an EC&R assignment for class- $l^{+}$ when considering the dual weight $1$ for all constraints except the bound constraint in $\mathcal{J}$ which has a dual weight of $2$ in the aggregation. Specifically, we aggregate the base constraint $y_{1}x_{8,4}-z_{l}\geq 0$ with weight $1$ , the positive flow-balance constraint at node $4$ that is $x_{4,1}+x_{4,3}-x_{8,4}\geq f_{4}$ with weight $y_{1}$ , the positive flow-balance constraint at node $3$ that is $x_{3,7}-x_{4,3}-x_{2,3}\geq f_{3}$ with weight $y_{1}$ , the negative flow-balance constraint at node $2$ that is $x_{6,2}-x_{2,1}-x_{2,3}\geq-f_{2}$ with weight $y_{2}$ , the positive flow-balance constraint at node $1$ that is $x_{1,5}-x_{4,1}-x_{2,1}\geq f_{1}$ with weight $1-y_{1}-y_{2}$ , the lower bound constraint for arc $(4,1)$ that is $x_{4,1}\geq 0$ with weight $2(1-y_{1}-y_{2})$ , and the upper bound constraint for arc $(2,3)$ that is $u_{2,3}-x_{2,3}\geq 0$ with weight $1-y_{1}-y_{2}$ . During this aggregation, six bilinear terms will be canceled, satisfying condition (C1) of the EC&R procedure. Further, at least one bilinear term for each constraint involved in the aggregation is canceled, satisfying condition (C2) of the EC&R procedure. Therefore, the above assignment is a valid EC&R assignment for class- $l^{+}$ . 
Next, we argue that the dual weight vector for this assignment corresponds to an extreme point of $\mathcal{C}^{l}$ in (3). For $\mathcal{S}$ in this example, the coefficient matrix of $\mathcal{C}^{l}$ , as depicted in (12), has 16 rows corresponding to bilinear terms $y_{j}x_{i}$ for all $j=1,2$ and $i\in\mathrm{A}$ . It is easy to verify that the columns corresponding to the six constraints in the above EC&R assignment are linearly independent. As a result, we can form a basis by adding to the above six columns 10 more linearly independent columns corresponding to the constraints used in the relaxation step for the bilinear terms remaining in the aggregated inequality together with the columns that complete the basis. The resulting basis corresponds to a basic feasible solution where all variables (interpreted as the dual weights for constraints involved in the EC&R procedure) are $0$ or $1$ , except the one associated with the column representing the lower bound constraint for arc $(4,1)$ which has the dual weight equal to $2$ . Therefore, there exist extreme points of $\mathcal{C}^{l}$ with components that are not all $0$ or $1$ .
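The cancellation count claimed in this example can be reproduced by bookkeeping the coefficient of each bilinear term $y_{j}x_{a}$ in the aggregation. The following is a minimal sketch under the constraints and weights quoted above. Each weight is stored only through its coefficients on $y_{1}$ and $y_{2}$ (so $1-y_{1}-y_{2}$ becomes `(-1, -1)`), and the base constraint is encoded as $x_{8,4}$ multiplied by $y_{1}$. The data layout and variable names are illustrative assumptions.

```python
from collections import defaultdict

# Each entry: (label, {arc: coefficient of x_arc}, (coef of y1, coef of y2) in the weight).
# Constants and linear terms are dropped; only y_j * x_arc products are tracked.
constraints = [
    ("base", {(8, 4): 1}, (1, 0)),                                 # y1*x84 - z_l >= 0
    ("node 4 (+)", {(4, 1): 1, (4, 3): 1, (8, 4): -1}, (1, 0)),    # weight y1
    ("node 3 (+)", {(3, 7): 1, (4, 3): -1, (2, 3): -1}, (1, 0)),   # weight y1
    ("node 2 (-)", {(6, 2): 1, (2, 1): -1, (2, 3): -1}, (0, 1)),   # weight y2
    ("node 1 (+)", {(1, 5): 1, (4, 1): -1, (2, 1): -1}, (-1, -1)), # weight 1 - y1 - y2
    ("lower bound (4,1)", {(4, 1): 1}, (-2, -2)),                  # weight 2(1 - y1 - y2)
    ("upper bound (2,3)", {(2, 3): -1}, (-1, -1)),                 # weight 1 - y1 - y2
]

total = defaultdict(int)    # (j, arc) -> aggregated coefficient of y_j * x_arc
touched = defaultdict(set)  # (j, arc) -> labels of constraints contributing to it

for label, coefs, weight in constraints:
    for arc, c in coefs.items():
        for j, w in enumerate(weight, start=1):
            if w != 0:
                total[(j, arc)] += c * w
                touched[(j, arc)].add(label)

# A term is canceled if it appears during aggregation and its coefficients sum to zero.
canceled = sorted(term for term, value in total.items() if value == 0)
# Constraints that contribute to at least one canceled term.
covered = set().union(*(touched[term] for term in canceled))

print(canceled)
print(len(canceled), covered == {label for label, _, _ in constraints})
```

The sketch reports six canceled terms, and every constraint in the aggregation contributes to at least one of them, in line with the statements made for conditions (C1) and (C2). It does not verify the extreme-point claim, which requires the full coefficient matrix (12).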

According to the above observation, identifying the aggregation weights for the EC&R procedure for $\mathcal{S}$ is not as straightforward as it is for $\mathcal{S}^{1}$ . Nevertheless, we next show that a generalization of the tree structure can still be detected for a given EC&R assignment. First, we give a few definitions that will be used to derive these results.

Definition 3.14

Consider set $\mathcal{S}$ where $\Xi$ describes the network polytope corresponding to network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ . We define a parallel network $\mathrm{G}^{j}=(\mathrm{V}^{j},\mathrm{A}^{j})$ for $j\in M$ to be a replica of $\mathrm{G}$ that represents the multiplication of flow variables $\bm{x}$ with $y_{j}$ during the aggregation procedure. To simplify presentation, we use the same node and arc notation across all parallel networks. For instance, for each node $v\in\mathrm{V}$ (resp. arc $a\in\mathrm{A}$ ), its replica $v$ belongs to $\mathrm{V}^{j}$ (resp. $a$ belongs to $\mathrm{A}^{j}$ ).

Definition 3.15

We say that a collection of subnetworks $\dot{\mathrm{G}}^{k}=(\dot{\mathrm{V}}^{k},\dot{\mathrm{A}}^{k})$ , for $k=1,\dotsc,r$ of parallel networks $\{\mathrm{G}^{j}\}_{j\in M}$ is vertically connected through the connection nodes $C_{v}\subseteq\mathrm{V}$ and connection arcs $C_{a}\subseteq\mathrm{A}$ if there exists an ordering $s_{1},s_{2},\dotsc,s_{r}$ of indices $1,\dotsc,r$ such that for each $i=1,\dotsc,r-1$ , there exists either (i) an arc $a\in C_{a}$ such that $a$ is incident to a node of $\dot{\mathrm{G}}^{s_{i+1}}$ and it is incident to a node of some subnetwork among $\dot{\mathrm{G}}^{s_{1}},\dotsc,\dot{\mathrm{G}}^{s_{i}}$ , or (ii) a set of nodes $v_{1},\dotsc,v_{p}\in C_{v}$ each adjacent to the previous one such that $v_{1}$ is adjacent to a node of $\dot{\mathrm{G}}^{s_{i+1}}$ and $v_{p}$ is adjacent to a node of some subnetwork among $\dot{\mathrm{G}}^{s_{1}},\dotsc,\dot{\mathrm{G}}^{s_{i}}$ . In this definition, if node $v_{1}$ is in $\dot{\mathrm{G}}^{s_{i+1}}$ , it counts as an adjacent node to the connection node $v_{1}$ .

Proposition 3.16

Consider set $\mathcal{S}$ with $\Xi$ that represents the network polytope corresponding to the network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ . Let $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ be an EC&R assignment that leads to a non-trivial facet-defining inequality of $\mathop{\rm conv}(\mathcal{S})$ . Assume that $\cup_{j\in M}\mathcal{I}_{j}\neq\emptyset$ . For each $j\in M$ , define $\widetilde{\mathcal{I}}^{j}=\{i\in\mathrm{V}|i^{\pm}\in\mathcal{I}_{j}\}$ , $\widetilde{\mathcal{I}}=\{i\in\mathrm{V}|i^{\pm}\in\bar{\mathcal{I}}\}$ , and $\widetilde{\mathcal{J}}=\{i\in\mathrm{A}|i\in\mathcal{J}\cup\bar{\mathcal{J}}\}$ . Then, there exist forests $\bar{\mathrm{F}}^{j}$ in the parallel network $\mathrm{G}^{j}$ for $j\in M$ , each composed of trees $\bar{\mathrm{T}}^{j}_{k}$ for $k\in\Gamma_{j}$ , where $\Gamma_{j}$ is an index set, such that

(i)

the forest $\bar{\mathrm{F}}^{j}$ is composed of the nodes in $\widetilde{\mathcal{I}}^{j}$ for all $j\in M$ ,

(ii)

the collection of the trees $\bar{\mathrm{T}}^{j}_{k}$ for all $k\in\Gamma_{j}$ and all $j\in M$ are vertically connected through connection nodes $\widetilde{\mathcal{I}}$ and connection arcs $\widetilde{\mathcal{J}}$ ,

(iii)

the collection of all nodes in $\bar{\mathrm{F}}^{j}$ for all $j\in M$ together with the connection nodes $\widetilde{\mathcal{I}}$ form a tree in $\mathrm{G}$ , which has at least one incident node to each connection arc in $\widetilde{\mathcal{J}}$ .

Proof 3.17

Proof. We show the result by proving conditions (i)–(iii). First, we may assume that the given EC&R assignment corresponds to class- $l^{\pm}$ for some $l\in K$ . Since $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ leads to a non-trivial facet-defining inequality of $\mathop{\rm conv}(\mathcal{S})$ , its corresponding dual weights in the aggregation should represent an extreme point of $\mathcal{C}^{l}$ defined in (3). This extreme point is associated with a basis in the coefficient matrix (12). In this basis, the subset of the column block that contains $E^{{}^{\intercal}}$ in the row block $j$ represents the flow-balance constraints (multiplied with $y_{j}$ ) for the nodes $i\in\widetilde{\mathcal{I}}^{j}$ , which can be viewed as the selected nodes in the parallel network $\mathrm{G}^{j}$ for $j\in M$ . Further, the rows in the row block $j$ represent the flow variables (multiplied with $y_{j}$ ) for each arc in $\mathrm{G}$ , which can be viewed as an arc in the parallel network $\mathrm{G}^{j}$ . We may reorder the columns and rows of this basis corresponding to each parallel network $\mathrm{G}^{j}$ to obtain a diagonal formation composed of diagonal blocks $E^{{}^{\intercal}}_{j,k}$ for $k$ in an index set $\Gamma_{j}$ . It follows from this structure that the nodes corresponding to the columns of $E^{{}^{\intercal}}_{j,k}$ in the parallel network $\mathrm{G}^{j}$ are connected via arcs of $\mathrm{G}^{j}$ represented by the matrix rows. Therefore, these nodes can form a tree $\bar{\mathrm{T}}^{j}_{k}$ for $k\in\Gamma_{j}$ , the collection of which represents a forest $\bar{\mathrm{F}}^{j}$ for all $j\in M$ , satisfying condition (i) of the proposition statement.

For condition (ii), considering the aforementioned diagonal block structure and representing the subset of column blocks of (12) containing $\bar{I}^{j}$ and $I$ in the basis by one block with $\pm$ sign (as only one of them can be selected in the basis), we can write the resulting system of equations for the associated basic feasible solution as follows

[TABLE]

where the last column in the basis corresponds to the constraints that have weights $0$ in the basic feasible solution and are added to complete the basis, and where the last row represents all bilinear terms that do not appear in any constraints during aggregation. Further, the row block next to the last row corresponds to the bilinear terms that appear during aggregation but not in any of the selected flow-balance constraints; the matrix $\pm\bar{I}^{1,\dotsc,m}$ denotes the bilinear constraints in $\mathcal{S}$ that contain these bilinear terms and could be used during the relaxation step. In the above basis, the column block that contains $-E^{{}^{\intercal}}$ represents the flow-balance constraints (multiplied with $1-\sum_{j\in M}y_{j}$ ) corresponding to the nodes in $\widetilde{\mathcal{I}}$ . Similarly, the column block that contains $\pm I$ in all rows represents the bound constraints on the flow variables (multiplied with $1-\sum_{j\in M}y_{j}$ ) corresponding to the arcs in $\widetilde{\mathcal{J}}$ . We refer to the column group composed of the columns of $E^{{}^{\intercal}}_{j,k}$ for any $j\in M$ and $k\in\Gamma_{j}$ as the column group representing the nodes of the tree $\bar{\mathrm{T}}^{j}_{k}$ . It is clear from the diagonal structure of the submatrix containing $E^{{}^{\intercal}}_{j,k}$ that the column groups representing the nodes of $\bar{\mathrm{T}}^{j}_{k}$ are all arc disjoint, i.e., there are no two columns from different groups that have a nonzero element in the same row. 
We claim that there exists an ordering $(s_{1},r_{1}),(s_{2},r_{2}),\dotsc,(s_{h},r_{h})$ of the pairs $(j,k)$ for all $j\in M$ and $k\in\Gamma_{j}$ such that for each column group representing the nodes of $\bar{\mathrm{T}}^{s_{i}}_{r_{i}}$ , for all $i=2,\dotsc,h$ , there exists either (i) a column from the column blocks corresponding to $\widetilde{\mathcal{J}}$ that has a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{i},r_{i}}$ and a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{t},r_{t}}$ for some $t\in\{1,\dotsc,i-1\}$ , or (ii) a sequence of columns in the column block corresponding to $\widetilde{\mathcal{I}}$ , each sharing a nonzero element in a common row with the previous one, such that the first column has a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{i},r_{i}}$ and the last column has a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{t},r_{t}}$ for some $t\in\{1,\dotsc,i-1\}$ . Assume by contradiction that no such ordering exists. Therefore, we can partition the rows in the first $m+1$ row blocks of (28) in such a way that no two rows in all columns composed of the columns in $E^{{}^{\intercal}}_{j,k}$ for all $j\in M$ and $k\in\Gamma_{j}$ and those corresponding to the columns of $\widetilde{\mathcal{I}}$ and $\widetilde{\mathcal{J}}$ have all their nonzero elements in the rows of one of these partitions. In this case, considering that the column blocks composed of $\pm I$ and those composed of $\pm\bar{I}^{j}$ have exactly one nonzero element in the basis, we can compactly rewrite the system of equations for the basic feasible solution as follows:

[TABLE]

where the first and second row blocks respectively correspond to the first and second partitions discussed above. In (29), $\bm{e}^{l}$ is a unit vector whose elements are all zero except that corresponding to the row representing $y_{j^{\prime}}x_{i^{\prime}}$ for some $i^{\prime},j^{\prime}$ that satisfy $A^{l}_{j^{\prime}i^{\prime}}=1$ , which is equal to $1$ . We may assume without the loss of generality that the row containing this nonzero element in $\bm{e}^{l}$ belongs to the first row block. All these columns except the ones in the last column block have positive weights because the associated constraints are assumed to be used in the aggregation. These weights are denoted by $\bm{+}$ in the first two row blocks of the vector multiplied with this matrix. It is now easy to verify that the linear combination of the columns in the second column block of the basis matrix with positive weights yields the zero vector. This shows that the columns are linearly dependent, a contradiction. Now, consider the ordering $(s_{1},r_{1}),(s_{2},r_{2}),\dotsc,(s_{h},r_{h})$ described above. It follows that for each $i=2,\dotsc,h$ , there exists either (i) a column from the column block corresponding to $\widetilde{\mathcal{J}}$ that has a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{i},r_{i}}$ and a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{t},r_{t}}$ for some $t\in\{1,\dotsc,i-1\}$ , or (ii) a sequence of columns in the column block corresponding to $\widetilde{\mathcal{I}}$ , each sharing a nonzero element in a common row with the previous one, such that the first column has a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{i},r_{i}}$ and the last column has a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{t},r_{t}}$ for some $t\in\{1,\dotsc,i-1\}$ . 
First, consider the case (i) in the above either-or argument holds for a column $k\in\widetilde{\mathcal{J}}$ . This column has nonzero elements in the rows representing arc $k$ in all subnetworks $\mathrm{G}^{j}$ for $j\in M$ . Matrix $E^{{}^{\intercal}}_{s_{i},r_{i}}$ has a nonzero element in row $k$ if the tree $\bar{\mathrm{T}}^{s_{i}}_{r_{i}}$ has an incident node to arc $k$ . We conclude that $\bar{\mathrm{T}}^{s_{i}}_{r_{i}}$ and $\bar{\mathrm{T}}^{s_{t}}_{r_{t}}$ have at least one node incident to $k$ , satisfying criterion (i) in Definition 3.15. Second, consider the case (ii) in the above either-or argument holds for a sequence $k_{1},\dotsc,k_{p}$ of the nodes corresponding to $\widetilde{\mathcal{I}}$ where $p\leq|\widetilde{\mathcal{I}}|$ . Any such column, say $k_{1}$ , has nonzero elements in the rows representing the arcs that are incident to node $k_{1}$ in all parallel networks $\mathrm{G}^{j}$ for $j\in M$ . Therefore, since each column contains a nonzero element in a common row with the previous one, the nodes corresponding to these columns must be adjacent to one another in $\mathrm{G}$ . Further, since the column corresponding to $k_{1}$ has a nonzero element in a row corresponding to a row of $E^{{}^{\intercal}}_{s_{i},r_{i}}$ , we conclude that $k_{1}$ is adjacent to a node in $\bar{\mathrm{T}}^{s_{i}}_{r_{i}}$ , which means that either $k_{1}$ belongs to this tree, or it is adjacent to a node of this tree. A similar argument can be made about node $k_{p}$ and the tree $\bar{\mathrm{T}}^{s_{t}}_{r_{t}}$ . This satisfies criterion (ii) of Definition 3.15. This proves condition (ii) of the proposition statement due to Definition 3.15.

For condition (iii), we show there exists a sequence $v_{1},\dotsc,v_{q}$ of all the nodes in $\cup_{j\in M}\widetilde{\mathcal{I}}^{j}\cup\widetilde{\mathcal{I}}$ , where $q=|\cup_{j\in M}\widetilde{\mathcal{I}}^{j}\cup\widetilde{\mathcal{I}}|$ and $v_{1}\in\cup_{j\in M}\widetilde{\mathcal{I}}^{j}$ , such that every node $v_{i}$ is adjacent to at least one node in $v_{1},\dotsc,v_{i-1}$ for every $i=2,\dotsc,q$ . We may assume that $v_{1}$ is incident to arc $i^{\prime}$ defined previously that is associated with the base equality $l$ . For other cases, where $i^{\prime}$ is not incident to any nodes in $\cup_{j\in M}\widetilde{\mathcal{I}}^{j}$ , the argument will be similar with an adjustment of the partitions described below. It follows from the previously proven conditions (i) and (ii) of the proposition statement, as well as Definition 3.15 that there exists a sequence $\bar{v}_{1},\dotsc,\bar{v}_{p}$ of all the nodes in $\cup_{j\in M}\widetilde{\mathcal{I}}^{j}\cup\widehat{\mathcal{I}}$ for some $\widehat{\mathcal{I}}\subseteq\widetilde{\mathcal{I}}$ such that $\bar{v}_{i}$ is adjacent to at least one node in $\bar{v}_{1},\dotsc,\bar{v}_{i-1}$ for every $i=2,\dotsc,p$ . We claim that there exists a sequence $\hat{v}_{1},\dotsc,\hat{v}_{q-p}$ of all the nodes in $\widetilde{\mathcal{I}}\setminus\widehat{\mathcal{I}}$ such that $\hat{v}_{i}$ is adjacent to at least one node in $\bar{v}_{1},\dotsc,\bar{v}_{p},\hat{v}_{1},\dotsc,\hat{v}_{i-1}$ for every $i=1,\dotsc,q-p$ . Assume by contradiction that no such sequence exists. Then, we can use an argument similar to that of case (ii) above to partition the rows of (28) in such a way that all columns corresponding to the nodes in $\bar{v}_{1},\dotsc,\bar{v}_{p},\hat{v}_{1},\dotsc,\hat{v}_{t}$ have all their nonzero elements in the first partition, and the columns corresponding to all the remaining nodes $\hat{v}_{t+1},\dotsc,\hat{v}_{q-p}$ have all their nonzero elements in the second partition. 
Then, a similar argument to that following (29) will show that the columns in the second group are linearly dependent, a contradiction. The case for the second part of the statement regarding the connection arcs in $\widetilde{\mathcal{J}}$ can be shown similarly. $\square$

Although Proposition 3.16 shows that an EC&R assignment corresponds to a forest structure in the parallel networks created from the underlying network in $\mathcal{S}$ , the converse result—similar to the one presented for set $\mathcal{S}^{1}$ —does not hold here. More specifically, a forest structure that satisfies the conditions of Proposition 3.16 does not necessarily lead to a valid EC&R assignment, and even when it does, the calculation of the aggregation weights to satisfy the EC&R conditions is not as straightforward; see Example 3.18 below.

Example 3.18

Consider set $\mathcal{S}$ where $\Xi$ describes the network polytope corresponding to network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ in Figure 1, and $\Delta=\{(y_{1},y_{2})\in{\mathbb{R}}^{2}_{+}|y_{1}+y_{2}\leq 1\}$ . Select class- $l^{+}$ corresponding to the base equality $y_{1}x_{1,5}-z_{l}=0$ . In the parallel network $\mathrm{G}^{1}$ , we select the forest $\bar{\mathrm{F}}^{1}$ composed of the tree $\bar{\mathrm{T}}^{1}_{1}$ with the node set $\widetilde{\mathcal{I}}^{1}=\{1,2\}$ . In the parallel network $\mathrm{G}^{2}$ , we select the forest $\bar{\mathrm{F}}^{2}$ composed of the tree $\bar{\mathrm{T}}^{2}_{1}$ with the node set $\widetilde{\mathcal{I}}^{2}=\{2,6\}$ . We select the connection node set $\widetilde{\mathcal{I}}=\{6\}$ , and the connection arc set $\widetilde{\mathcal{J}}=\emptyset$ . It is easy to verify that these sets satisfy the conditions (i)–(iii) of Proposition 3.16. However, we cannot find an aggregation weight for the flow-balance constraints corresponding to the nodes in the above sets that yields a cancellation of at least 5 bilinear terms. As a result, there is no EC&R assignment that matches the considered forest structure.
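The structural conditions of Proposition 3.16 for this example can be checked with a few lines of code. The sketch below assumes the arc set of Figure 1 implied by the flow-balance constraints quoted in Example 3.13, and it uses a simplified form of Definition 3.15 that only covers the present situation of two trees joined by a single connection node. It verifies the forest structure only; it does not search for aggregation weights, so it cannot confirm the non-existence claim. All names are illustrative.

```python
# Arc set inferred from the flow-balance constraints of Example 3.13 (an assumption).
ARCS = [(8, 4), (4, 1), (4, 3), (3, 7), (2, 3), (6, 2), (2, 1), (1, 5)]


def neighbors(v):
    """Nodes adjacent to v in G, ignoring arc direction."""
    return {j for i, j in ARCS if i == v} | {i for i, j in ARCS if j == v}


def is_connected(nodes):
    """True if `nodes` induce a connected subgraph of G."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in neighbors(u) & nodes:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def touches(v, tree):
    """Definition 3.15 counts a connection node inside a tree as adjacent to it."""
    return v in tree or bool(neighbors(v) & tree)


tree_1 = {1, 2}           # tree selected in parallel network G^1
tree_2 = {2, 6}           # tree selected in parallel network G^2
connection_nodes = {6}    # the connection arc set is empty in this example

# (i) each selected node set supports a tree in its parallel network.
cond_i = is_connected(tree_1) and is_connected(tree_2)
# (ii) simplified: one connection node links the two trees.
cond_ii = any(touches(v, tree_1) and touches(v, tree_2) for v in connection_nodes)
# (iii) all selected nodes together with the connection nodes are connected in G.
cond_iii = is_connected(tree_1 | tree_2 | connection_nodes)

print(cond_i, cond_ii, cond_iii)
```

All three checks succeed under the assumed arc set, which illustrates that the conditions of Proposition 3.16 are necessary but not sufficient for an EC&R assignment.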

A common way to circumvent the above-mentioned difficulty in obtaining valid EC&R assignments and their aggregation weights is to aim at a special class of EC&R assignments with more specific attributes that can be used to strengthen the connection between an EC&R assignment and its corresponding network structure. An important example of such a class is the class of EC&R assignments that are obtained through pairwise cancellation. In this procedure, each cancellation of bilinear terms is obtained by aggregating two constraints. This definition includes the bilinear terms that are canceled during the relaxation step, i.e., the constraint used to relax the remaining bilinear terms counts as one of the two constraints in the preceding statement. Following this procedure, the aggregation weight for each constraint can be determined successively as the constraint is added to the assignment to ensure the satisfaction of the EC&R conditions. The next result shows that the aggregation weights for all constraints used in the EC&R assignments obtained through pairwise cancellation are $1$ .

Proposition 3.19

Consider set $\mathcal{S}$ where $\Xi$ describes the network polytope corresponding to network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ . Let $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ be an EC&R assignment for class- $l^{\pm}$ for some $l\in K$ corresponding to pairwise cancellation. Then, the aggregation weights for all constraints used in this assignment are $1$ .

Proof 3.20

Proof. Let $\bar{\bm{\pi}}^{l}$ be the solution vector for the system of equations (3) of $\mathcal{C}^{l}$ corresponding to the aggregation weights of the given EC&R assignment. We may rewrite this system of equations as follows by rearranging its rows and columns.

[TABLE]

In the coefficient matrix of (30), the first row block represents the bilinear terms that are canceled during aggregation. The second row block corresponds to the remaining bilinear terms in the aggregated inequality that are relaxed in the last step of the EC&R procedure. The last row block represents all the bilinear terms that are not involved in the aggregation procedure. Further, in this matrix, the first column block corresponds to the constraints used in the aggregation, whose aggregation weights in the solution vector $\bar{\bm{\pi}}^{l}$ are denoted by $\bar{\bm{\pi}}^{l}_{1}$ . The second column block corresponds to the variable bound constraints in $\Xi$ as well as the bilinear constraints in $\mathcal{S}$ used in the EC&R procedure to relax the remaining bilinear terms in the aggregated inequality, whose weights in the solution vector $\bar{\bm{\pi}}^{l}$ are denoted by $\bar{\bm{\pi}}^{l}_{2}$ . The last column block represents all other constraints that are not used during the EC&R procedure and their weights in the solution vector $\bar{\bm{\pi}}^{l}$ are zero. Finally, $\bm{e}^{l}$ on the right-hand-side of this system is a unit vector whose elements are all zeros except that corresponding to the row representing $y_{j^{\prime}}x_{i^{\prime}}$ for some $i^{\prime},j^{\prime}$ that satisfy $A^{l}_{j^{\prime}i^{\prime}}=1$ , which is equal to $1$ . It is clear that this row belongs to the first row block since according to the EC&R condition (C2), the bilinear term in the base equality $l$ must be canceled during the aggregation procedure when the assignments are not empty. It follows from the equation (30) that $P_{1}\bar{\bm{\pi}}^{l}_{1}=\pm\bm{e}^{l}$ . Next, we analyze the structure of $P_{1}$ . Note that all elements of $P_{1}$ belong to $\{0,-1,1\}$ because it is a submatrix of (12) that represents the coefficients of the constraints in $\mathcal{S}$ . 
Considering that the columns of $P_{1}$ represent the constraints used in the aggregation except the base equality (as that constraint has been moved to the right-hand-side to form $\mathcal{C}^{l}$ ), and that the rows of $P_{1}$ correspond to the canceled bilinear terms during aggregation, according to condition (C1) of EC&R, we conclude that the number of rows of $P_{1}$ is no smaller than the number of columns of $P_{1}$ . Further, it follows from condition (C2) of EC&R that each constraint used in the aggregation (after being multiplied with its corresponding weight) will have at least one bilinear term canceled, which implies that each column of $P_{1}$ has at least one nonzero element. The assumption of pairwise cancellation for the given EC&R assignment implies that each canceled bilinear term corresponding to the rows of $P_{1}$ is obtained through aggregation of exactly two constraints. As a result, each row of $P_{1}$ must contain exactly two nonzero elements, except for the row corresponding to the bilinear term $y_{j^{\prime}}x_{i^{\prime}}$ that appears in the base equality $y_{j^{\prime}}x_{i^{\prime}}-z_{l}=0$ which must have only one nonzero element because the weight of the base equality has been fixed at $\pm 1$ and its column has been moved to the right-hand-side of the equation captured by $\pm\bm{e}^{l}$ ; see the derivation of (3). Therefore, we may rearrange the rows and columns of the matrices in this equation to obtain the form:

[TABLE]

where $\bar{\bar{\bm{\pi}}}^{l}_{1}$ is composed of the elements of $\bar{\bm{\pi}}^{l}_{1}$ that are rearranged to match the rearrangement of columns of $P_{1}$ in the above form, and where the first row corresponds to the bilinear term $y_{j^{\prime}}x_{i^{\prime}}$ so that the right-hand-side vector becomes $\pm\bm{e}^{1}$ . It follows from the above discussion about the structure of $P_{1}$ and equation (31) that all components of $\bar{\bar{\bm{\pi}}}^{l}_{1}$ must be equal to 1 as they need to be nonnegative. Finally, for the equations in the second row block of (30), we have that $P_{2}\bar{\bm{\pi}}^{l}_{1}\pm I\bar{\bm{\pi}}^{l}_{2}=\bm{0}$ . It follows from the pairwise cancellation assumption that each row of $P_{2}$ contains exactly one nonzero element as it corresponds to a remaining bilinear term in the aggregation inequality. Since all of the elements in $P_{2}$ belong to $\{0,-1,1\}$ , and all the components in $\bar{\bm{\pi}}^{l}_{1}$ are equal to 1, it must hold that $\bar{\bm{\pi}}^{l}_{2}=\bm{1}$ . \Halmos
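
The propagation argument in this proof can be illustrated on a small instance. The following is a minimal sketch, assuming a hypothetical $3\times 3$ matrix $P_{1}$ (not taken from any network in this paper) whose first row has one nonzero element and whose other rows have exactly two nonzero elements of opposite sign, with right-hand-side $+\bm{e}^{1}$ . The function name `solve_pairwise_system` is introduced here for illustration only.

```python
def solve_pairwise_system(rows, rhs, n_cols):
    """Solve P1 * pi = rhs by propagating already-determined weights.

    rows: list of dicts mapping column index -> coefficient in {-1, 1}
    rhs:  list of right-hand-side values, one per row
    """
    pi = [None] * n_cols
    progress = True
    while progress:
        progress = False
        for row, b in zip(rows, rhs):
            unknown = [c for c in row if pi[c] is None]
            if len(unknown) == 1:
                c = unknown[0]
                known = sum(coef * pi[k] for k, coef in row.items() if k != c)
                # Coefficients are +1 or -1, so integer division is exact.
                pi[c] = (b - known) // row[c]
                progress = True
    return pi


# Hypothetical P1: row 0 is the base bilinear term (one nonzero element);
# rows 1 and 2 are canceled terms shared by exactly two constraints.
rows = [{0: 1}, {0: -1, 1: 1}, {1: -1, 2: 1}]
rhs = [1, 0, 0]
weights = solve_pairwise_system(rows, rhs, 3)
```

In this sketch the first row fixes one weight at 1, and each subsequent row forces the next weight to match it, which mirrors how the proof concludes that all components of the weight vector equal 1.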

Remark 3.21

For the case with $m=1$ , as described in the proof of Proposition 3.6, there are two (resp. three) possible scenarios for constraints that could be used in the aggregation to cancel a bilinear term $y_{1}x_{i}$ for any $i\in N\setminus\{l\}$ (resp. for $i=l$ ). Since the aggregation weights for all constraints are $1$ in this case (see Proposition 3.2), we conclude that each cancellation is obtained through aggregation of exactly two constraints. Further, any remaining bilinear term in the aggregated inequality corresponds to an arc that is incident to exactly one node of the tree associated with the EC&R assignment (see Proposition 3.6), which implies that each such bilinear term appears in exactly one constraint during aggregation. As a result, all EC&R inequalities for the case with $m=1$ can be obtained through pairwise cancellation.

Although the EC&R inequalities obtained through pairwise cancellation do not necessarily produce a full convex hull description for $\mathcal{S}$ , the result of Proposition 3.19 provides three important advantages: (i) it generalizes the convexification results for the case with $m=1$ as described in Remark 3.21; (ii) it can produce inequalities stronger than those obtained by applying Theorem 3.8 to relaxations of $\mathcal{S}$ that contain one $y$ variable at a time, because it considers all the $y$ variables in their original simplex set $\Delta_{m}$ ; and (iii) it enables us to derive explicit EC&R inequalities cognizant of the underlying network structure without the need to search for the aggregation weights that satisfy the EC&R conditions as will be shown in the sequel. These advantages are corroborated by the computational experiments presented in Section 4. The next proposition shows that the pairwise cancellation property provides more information about the forest structure presented in Proposition 3.16.

Proposition 3.22

Consider the setting of Proposition 3.16, and assume that the EC&R assignment $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ has the pairwise cancellation property. Further, let this assignment correspond to a class- $l^{\pm}$ for some $l\in K$ such that $A^{l}_{j^{\prime}i^{\prime}}=1$ for some $(i^{\prime},j^{\prime})\in N\times M$ . Then, in addition to the outcome of Proposition 3.16, we have that

(i)

arc $i^{\prime}$ is either in $\widetilde{\mathcal{J}}$ or incident to exactly one node in $\widetilde{\mathcal{I}}\cup\widetilde{\mathcal{I}}^{j^{\prime}}$ , but not both,

(ii)

each arc in $\widetilde{\mathcal{J}}$ is incident to at most one node in $\widetilde{\mathcal{I}}\cup\widetilde{\mathcal{I}}^{j}$ for each $j\in M$ ,

(iii)

each node in $\widetilde{\mathcal{I}}\cap\widetilde{\mathcal{I}}^{j}$ , for $j\in M\setminus\{j^{\prime}\}$ (resp. $j=j^{\prime}$ ), is adjacent to no other nodes in that set and no arcs in $\widetilde{\mathcal{J}}$ (resp. $\widetilde{\mathcal{J}}\cup\{i^{\prime}\}$ ).

Proof 3.23

Proof. For case (i), it follows from condition (C2) of the EC&R procedure that the bilinear term $y_{j^{\prime}}x_{i^{\prime}}$ in the base equality must be canceled during aggregation. Further, according to the pairwise cancellation property, there must be exactly one constraint in the aggregation in addition to the base equality that would contain $y_{j^{\prime}}x_{i^{\prime}}$ after multiplication with the corresponding dual weight. There are two possible scenarios. The first possibility is that the bound constraints for $x_{i^{\prime}}$ are used in the aggregation, which implies that arc $i^{\prime}$ is a connection arc and belongs to $\widetilde{\mathcal{J}}$ . The second possibility is that the flow-balance constraint at either node $t(i^{\prime})$ or $h(i^{\prime})$ , but not both, is used in the aggregation, which implies that $i^{\prime}$ is incident to exactly one node in $\widetilde{\mathcal{I}}\cup\widetilde{\mathcal{I}}^{j^{\prime}}$ .

For case (ii), consider an arc $i\in\widetilde{\mathcal{J}}$ . Therefore, one of the bound constraints $x_{i}\geq 0$ or $u_{i}-x_{i}\geq 0$ is used in the aggregation with weight $1-\sum_{j\in M}y_{j}$ . It follows from the pairwise cancellation property that, for each $j\in M$ , there can be at most one additional constraint in the aggregation that contains a term $y_{j}x_{i}$ . The only possibility for such a constraint is the flow-balance constraint at either node $t(i)$ or $h(i)$ , but not both. We conclude that $i$ is incident to at most one node in $\widetilde{\mathcal{I}}\cup\widetilde{\mathcal{I}}^{j}$ for each $j\in M$ .

For case (iii), consider a node $i\in\widetilde{\mathcal{I}}\cap\widetilde{\mathcal{I}}^{j}$ for some $j\in M\setminus\{j^{\prime}\}$ . Therefore, the aggregation contains the (positive or negative) flow-balance constraint at node $i$ multiplied with $1-\sum_{j\in M}y_{j}$ due to $i\in\widetilde{\mathcal{I}}$ , together with the (positive or negative) flow-balance constraint at node $i$ multiplied with $y_{j}$ due to $i\in\widetilde{\mathcal{I}}^{j}$ . Therefore, the bilinear terms $y_{j}x_{k}$ for all $k\in\delta^{+}(i)\cup\delta^{-}(i)$ already appear in two constraints, which implies that they cannot appear in any other constraints during aggregation. As a result, the bound constraints for each variable $x_{k}$ corresponding to arc $k$ cannot be included in $\widetilde{\mathcal{J}}$ . Similarly, the flow-balance constraint at node $h(k)$ for any $k\in\delta^{+}(i)$ and at node $t(k)$ for any $k\in\delta^{-}(i)$ cannot be included in the aggregation, which implies that $i$ cannot be adjacent to any other nodes in $\widetilde{\mathcal{I}}\cap\widetilde{\mathcal{I}}^{j}$ . The proof for the case where $j=j^{\prime}$ follows from a similar argument. \Halmos
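
Conditions (i) and (ii) are simple incidence tests and can be checked mechanically. The following is a minimal sketch, assuming arcs are stored as (tail, head) pairs and that condition (i) is read literally as an exclusive-or; the function names are hypothetical. The sets used in the call are those selected later in Example 3.26.

```python
def incident_count(arc, nodes):
    """Number of endpoints of arc = (tail, head) that lie in the node set."""
    tail, head = arc
    return int(tail in nodes) + int(head in nodes)


def check_conditions_i_ii(base_arc, j_base, conn_nodes, conn_arcs, tree_nodes):
    """Check conditions (i) and (ii) of Proposition 3.22.

    tree_nodes maps each j in M to its node set; conn_nodes and conn_arcs
    are the connection node set and the connection arc set.
    """
    # (i): the base arc is a connection arc, or it touches exactly one
    # selected node, but not both (literal exclusive-or reading).
    in_arcs = base_arc in conn_arcs
    one_node = incident_count(base_arc, conn_nodes | tree_nodes[j_base]) == 1
    cond_i = in_arcs != one_node
    # (ii): each connection arc touches at most one selected node, for each j.
    cond_ii = all(
        incident_count(a, conn_nodes | nodes_j) <= 1
        for a in conn_arcs
        for nodes_j in tree_nodes.values()
    )
    return cond_i, cond_ii


# Sets from Example 3.26: base arc (1,5) with j' = 1.
result = check_conditions_i_ii(
    base_arc=(1, 5),
    j_base=1,
    conn_nodes={3},
    conn_arcs={(8, 4)},
    tree_nodes={1: {1, 2, 6, 8}, 2: {1, 4}},
)
```

Condition (iii) and the conditions of Proposition 3.16 depend on adjacency in the full network and are not covered by this sketch.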

As noted earlier, an important consequence of the pairwise cancellation property is that it allows us to derive the converse of the statements of Propositions 3.16 and 3.22, which identifies EC&R assignments based on a special forest structure in the underlying network.

Theorem 3.24

Consider set $\mathcal{S}$ with $\Xi$ that represents the network polytope corresponding to the network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ . Let $\bar{\mathrm{F}}^{j}$ , for each $j\in M$ , be a forest in the parallel network $\mathrm{G}^{j}$ , composed of trees $\bar{\mathrm{T}}^{j}_{k}$ for $k\in\Gamma_{j}$ , where $\Gamma_{j}$ is an index set, that satisfies the conditions (i)–(iii) of Propositions 3.16 and 3.22 with the corresponding node sets $\widetilde{\mathcal{I}}^{j}$ , the connection node set $\widetilde{\mathcal{I}}$ , the connection arc set $\widetilde{\mathcal{J}}$ , and the class $l$ . Then, the assignment $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ obtained from Algorithm 1 is an EC&R assignment for class- $l^{\pm}$ .

Proof 3.25

Proof. First, we argue that conditions (i)–(iii) of Proposition 3.16 imply that each member of the sets $\widetilde{\mathcal{I}}$ , $\widetilde{\mathcal{J}}$ , and $\widetilde{\mathcal{I}}^{j}$ for $j\in M$ receives a label assignment through the steps of Algorithm 1, i.e., the member is added to set $\mathtt{D}$ defined in that algorithm. It follows from condition (i) of Proposition 3.16 that once a member of the node subset in $\widetilde{\mathcal{I}}^{j}$ , for each $j\in M$ , that represents a tree $\bar{\mathrm{T}}^{j}_{k}$ with $k\in\Gamma_{j}$ is added to $\mathtt{D}$ , all the remaining nodes in $\bar{\mathrm{T}}^{j}_{k}$ are eventually added to $\mathtt{D}$ because of the loop in lines 11–13 in the algorithm, as all nodes of the tree are connected. Condition (ii) of Proposition 3.16 implies that all trees $\bar{\mathrm{T}}^{j}_{k}$ for $k\in\Gamma_{j}$ and $j\in M$ are connected through an appropriate sequence of the tree nodes, the connection nodes in $\widetilde{\mathcal{I}}$ , and the connection arcs in $\widetilde{\mathcal{J}}$ . Consequently, the loops in lines 10–44 of the algorithm ensure that each member of these sets is visited following that sequence and becomes added to $\mathtt{D}$ . Further, condition (iii) of Proposition 3.16 suggests that each member in the sets $\widetilde{\mathcal{I}}$ and $\widetilde{\mathcal{J}}$ is connected to the subgraph composed of the set of all tree nodes in $\widetilde{\mathcal{I}}^{j}$ and their associated connection nodes and connection arcs. As a result, there exists a sequence of adjacent nodes and arcs that lead to each member of $\widetilde{\mathcal{I}}$ and $\widetilde{\mathcal{J}}$ , thereby getting added to $\mathtt{D}$ .

Second, we show that each bilinear term created during the aggregation can appear in at most two constraints. There are four cases. In case 1, consider the bilinear term $y_{j^{\prime}}x_{i^{\prime}}$ that appears in the base equality $l$ . Condition (i) of Proposition 3.22 implies that this bilinear term can appear in exactly one other constraint, which could be either the bound constraint on variable $x_{i^{\prime}}$ (which would be included in $\widetilde{\mathcal{J}}$ ) or the flow-balance constraint at one of the incident nodes to $i^{\prime}$ (which would be included in $\widetilde{\mathcal{I}}^{j^{\prime}}\cup\widetilde{\mathcal{I}}$ ). In case 2, consider a bilinear term $y_{j}x_{i}$ , for some $j\in M$ , that appears in the bound constraint on variable $x_{i}$ for any arc $i\in\widetilde{\mathcal{J}}$ . Condition (ii) of Proposition 3.22 implies that this bilinear term can appear in at most one other constraint, which could be the flow-balance constraint at one of the incident nodes to $i$ (which would be included in $\widetilde{\mathcal{I}}^{j}\cup\widetilde{\mathcal{I}}$ ). In case 3, consider a bilinear term $y_{j}x_{i}$ , for some $j\in M$ , that appears in the flow-balance constraint at an incident node of arc $i$ after being multiplied with both $y_{j}$ (i.e., the node being in $\widetilde{\mathcal{I}}^{j}$ ) and $1-\sum_{j\in M}y_{j}$ (i.e., the node being in $\widetilde{\mathcal{I}}$ ). Condition (iii) of Proposition 3.22 implies that this bilinear term cannot appear in any other constraints during aggregation. In case 4, consider a bilinear term $y_{j}x_{i}$ , for some $j\in M$ , that appears in the flow-balance constraint at an incident node of arc $i$ that is not in $\widetilde{\mathcal{I}}^{j}\cap\widetilde{\mathcal{I}}$ . 
It follows from condition (iii) of Proposition 3.16 that this bilinear term can appear in at most one other constraint because of the tree structure of all the nodes in $\cup_{j\in M}\widetilde{\mathcal{I}}^{j}\cup\widetilde{\mathcal{I}}$ .

Third, we discuss that, for any $k\in\mathtt{D}$ that has been newly added to this set, its label value has been determined through lines 10–44 of Algorithm 1 in such a way that, for a member $i\in\cup_{j\in M}\widetilde{\mathcal{I}}^{j}\cup\widetilde{\mathcal{I}}\cup\widetilde{\mathcal{J}}$ that has been previously added to $\mathtt{D}$ and is adjacent/incident to $k$ , the bilinear term that commonly appears in the weighted constraints corresponding to both $i$ and $k$ is canceled. For instance, consider the case where $i\in\widetilde{\mathcal{I}}$ (line 22 of the algorithm) and $k\in\widetilde{\mathcal{I}}^{j}$ for some $j\in M$ is an adjacent node to $i$ (line 26 of the algorithm). Assume that $\mathtt{l}(i)=+$ , and that arc $p\in\mathrm{A}$ is such that $t(p)=i$ and $h(p)=k$ . It follows from line 27 of the algorithm that $\mathtt{l}(k)=-$ . Considering the assignment rule in lines 48 and 52 of the algorithm, we should aggregate the constraint $\sum_{r\in\delta^{+}(i)\setminus\{p\}}x_{r}-\sum_{r\in\delta^{-}(i)}x_{r}+x_{p}\geq f_{i}$ with weight $1-\sum_{j\in M}y_{j}$ , together with the constraint $-\sum_{r\in\delta^{+}(k)}x_{r}+\sum_{r\in\delta^{-}(k)\setminus\{p\}}x_{r}+x_{p}\geq-f_{k}$ with weight $y_{j}$ , which results in the cancellation of the bilinear term $y_{j}x_{p}$ . A similar argument can be made for any other possible case in Algorithm 1.
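
The cancellation in this instance can be traced by tracking only the bilinear coefficients. The following is a minimal sketch, assuming a hypothetical local network in which node $i$ has one outgoing arc `p` (toward $k$ ) and one incoming arc `q`, node $k$ has one additional outgoing arc `r`, and $M=\{1,2\}$ with $j=1$ ; the helper `weighted_bilinear` is introduced here for illustration.

```python
from collections import defaultdict


def weighted_bilinear(coeffs, weight, M):
    """Bilinear coefficients of (sum_r coeffs[r] * x_r) times a weight.

    weight is ("y", j) for y_j, or ("slack",) for 1 - sum_{j in M} y_j.
    Returns a dict mapping (j, arc) -> coefficient of y_j * x_arc.
    """
    out = defaultdict(int)
    for arc, c in coeffs.items():
        if weight[0] == "y":
            out[(weight[1], arc)] += c
        else:
            # The slack weight contributes -y_j * x_arc for every j in M.
            for j in M:
                out[(j, arc)] -= c
    return out


M = [1, 2]
# Positive flow-balance constraint at node i: arc p leaves i (+1), arc q enters i (-1).
node_i = {"p": 1, "q": -1}
# Negative flow-balance constraint at node k: arc p enters k (+1), arc r leaves k (-1).
node_k = {"p": 1, "r": -1}

total = defaultdict(int)
for part in (weighted_bilinear(node_i, ("slack",), M),
             weighted_bilinear(node_k, ("y", 1), M)):
    for key, c in part.items():
        total[key] += c
```

After aggregation, the coefficient of $y_{1}x_{p}$ in `total` is zero, while $y_{2}x_{p}$ keeps a nonzero coefficient and would remain in the aggregated inequality unless another constraint cancels it.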

Combining all the results shown in the previous parts, i.e., (I) each member of the sets $\widetilde{\mathcal{I}}$ , $\widetilde{\mathcal{J}}$ , and $\widetilde{\mathcal{I}}^{j}$ for $j\in M$ receives a label assignment and is added to $\mathtt{D}$ ; (II) each bilinear term created during the aggregation can appear in at most two constraints; and (III) for any $k\in\mathtt{D}$ , its label value is determined in such a way that the bilinear term that is common between the weighted constraints corresponding to $k$ and a previously added member $i$ in $\mathtt{D}$ is canceled, we conclude that at least $|\widetilde{\mathcal{I}}|+|\widetilde{\mathcal{J}}|+\sum_{j\in M}|\widetilde{\mathcal{I}}^{j}|$ bilinear terms will be canceled during aggregation in the desired assignment $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ . This satisfies the EC&R condition (C1). Finally, the above argument also implies that each flow-balance constraint at the nodes in $\cup_{j\in M}\widetilde{\mathcal{I}}^{j}\cup\widetilde{\mathcal{I}}$ , and each variable bound constraint for the arcs in $\widetilde{\mathcal{J}}$ will have at least one of their bilinear terms (after being multiplied with appropriate weights) canceled because each such node or arc will eventually be added to $\mathtt{D}$ when it receives a label for the desired cancellation. This satisfies the EC&R condition (C2). We conclude that $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ is an EC&R assignment. \Halmos

In view of Theorem 3.24, once we identify a forest structure with the desired conditions, we can use the steps in Algorithm 1 to determine the weight of each constraint in the corresponding EC&R assignment by following a path that starts from the arc associated with the base equality and reaches the node or arc associated with that constraint. We illustrate this approach in the following example.

Example 3.26

Consider set $\mathcal{S}$ with $m=2$ and $\Xi$ that represents the primal network model corresponding to the graph $\mathrm{G}=(\mathrm{V},\mathrm{A})$ shown in Figure 1. Similarly to Example 3.10, we refer to each arc in this network as a pair $(i,j)$ of its tail node $i$ and its head node $j$ , and denote its corresponding flow variable as $x_{i,j}$ . Assume that we are interested in finding EC&R assignments for class- $l^{+}$ where the base equality $l$ contains the bilinear term $y_{1}x_{1,5}$ , i.e., $i^{\prime}=(1,5)$ and $j^{\prime}=1$ . According to Theorem 3.24, we need to identify a forest structure that satisfies the conditions (i)–(iii) of Propositions 3.16 and 3.22. In the parallel network $\mathrm{G}^{1}$ , we select the forest $\bar{\mathrm{F}}^{1}$ composed of the tree $\bar{\mathrm{T}}^{1}_{1}$ with the node set $\{1,2,6\}$ and the tree $\bar{\mathrm{T}}^{1}_{2}$ with the node set $\{8\}$ . In the parallel network $\mathrm{G}^{2}$ , we select the forest $\bar{\mathrm{F}}^{2}$ composed of the tree $\bar{\mathrm{T}}^{2}_{1}$ with the node set $\{1,4\}$ . Therefore, we can form the sets $\widetilde{\mathcal{I}}^{1}=\{1,2,6,8\}$ and $\widetilde{\mathcal{I}}^{2}=\{1,4\}$ . We select the connection node set $\widetilde{\mathcal{I}}=\{3\}$ , and the connection arc set $\widetilde{\mathcal{J}}=\{(8,4)\}$ . It is easy to verify that these sets satisfy the conditions (i)–(iii) of Propositions 3.16 and 3.22. Next, we determine the label of each node and arc in the above sets through applying Algorithm 1. According to line 2 of this algorithm, we set $\mathtt{l}(1,5)=+$ in parallel network $\mathrm{G}^{1}$ , and select $k=t(1,5)=1\in\widetilde{\mathcal{I}}^{1}$ . It follows from line 5 of the algorithm that $\mathtt{l}(1)=-$ and $k$ is added to $\mathtt{D}$ . Following lines 10–13, we obtain for $\widetilde{\mathcal{I}}^{1}$ that $\mathtt{l}(2)=\mathtt{l}(6)=-$ , and for $\widetilde{\mathcal{I}}$ that $\mathtt{l}(3)=+$ . 
Then, from lines 26–28 of Algorithm 1, we deduce for $\widetilde{\mathcal{I}}^{2}$ that $\mathtt{l}(4)=-$ , and from lines 11–13 for $\widetilde{\mathcal{I}}^{2}$ , we obtain that $\mathtt{l}(1)=-$ . Lines 32–34 imply that $\mathtt{l}(8,4)=-$ for $\widetilde{\mathcal{J}}$ . Lastly, we conclude from lines 38–40 that $\mathtt{l}(8)=-$ for $\widetilde{\mathcal{I}}^{1}$ . As a result, following lines 47–59 of the algorithm, we obtain the EC&R assignment $\big{[}\{1^{-},2^{-},6^{-},8^{-}\},\{1^{-},4^{-}\},\{3^{+}\}\big{|}\emptyset,\{(8,4)\}\big{]}$ for class- $l^{+}$ . Based on this assignment, we multiply the negative flow-balance constraints at nodes $1,2,6,8$ with $y_{1}$ , we multiply the negative flow-balance constraints at nodes $1,4$ with $y_{2}$ , we multiply the positive flow-balance constraint at node $3$ with $1-y_{1}-y_{2}$ , and we multiply the upper bound constraint on variable $x_{8,4}$ with $1-y_{1}-y_{2}$ , and aggregate them with the base bilinear equality corresponding to arc $(1,5)$ with weight 1 to obtain the aggregated inequality

[TABLE]

where $f_{i}$ denotes the supply/demand value at node $i$ , and $u_{i,j}$ denotes the upper bound for variable $x_{i,j}$ . Following Remark 3.1, we may relax each of the seven remaining bilinear terms into two possible linear expressions, leading to $2^{7}=128$ total EC&R inequalities. When these inequalities are generated inside a separation oracle, Remark 3.5 can be used to find the most violated inequality among these 128 inequalities in linear time.

We conclude this section with a remark on the practical implementation of the proposed EC&R inequalities. While there is an efficient separation algorithm to find a separating EC&R inequality among those created from a given EC&R assignment as noted in Remark 3.5, the choice of the class of an EC&R assignment and its possible forest structure in the underlying network can lead to a large pool of candidates to consider during a branch-and-cut approach. Note that each EC&R inequality is obtained through an aggregation of the constraints of $\mathcal{S}$ with proper weights. In particular, given an EC&R assignment $\big{[}\mathcal{I}_{1},\dotsc,\mathcal{I}_{m},\bar{\mathcal{I}}\big{|}\mathcal{J},\bar{\mathcal{J}}\big{]}$ for class- $l^{\pm}$ , we aggregate the base inequality of the form $f_{l}(\bm{x},\bm{y},\bm{z})\geq 0$ with constraints of the general form $h(\bm{y})g(\bm{x})\geq 0$ , where $h(\bm{y})$ represents the aggregation weight that could be $y_{j}$ or $1-\sum_{j\in M}y_{j}$ , and where $g(\bm{x})\geq 0$ denotes a linear side constraint that could be the flow-balance or variable bound constraints. In most branch-and-cut approaches, the starting relaxation of the problem contains all linear side constraints on $\bm{x}$ and $\bm{y}$ . It follows that an optimal solution $(\bar{\bm{x}};\bar{\bm{y}};\bar{\bm{z}})$ of such relaxation that is to be separated satisfies $h(\bar{\bm{y}})g(\bar{\bm{x}})\geq 0$ for all valid choices of function $h(\bm{y})$ and constraint $g(\bm{x})\geq 0$ . Therefore, for the resulting aggregated inequality to be violated at a point $(\bar{\bm{x}};\bar{\bm{y}};\bar{\bm{z}})$ , we must have the base inequality violated at that point, i.e., $f_{l}(\bar{\bm{x}},\bar{\bm{y}},\bar{\bm{z}})<0$ . This observation can be used to select the class and sign of the EC&R assignment to be generated during the separation process. 
To this end, we may sort the values $\Psi_{k}=|\bar{y}_{j}\bar{x}_{i}-\bar{z}_{k}|$ for all $(i,j,k)$ such that $A^{k}_{ji}=1$ , and choose class $k$ as that associated with the largest $\Psi_{k}$ , with the class sign $+$ if $\bar{y}_{j}\bar{x}_{i}-\bar{z}_{k}<0$ , and class sign $-$ otherwise. This perspective can shed light on the observation that the EC&R inequalities obtained from fewer aggregations tend to be more effective in practice as noted in **[8]** and also observed in our experiments in Section 4. Specifically, the addition of constraints $h(\bm{y})g(\bm{x})\geq 0$ in the aggregation can increase the left-hand-side value in the aggregated inequality when $h(\bar{\bm{y}})g(\bar{\bm{x}})>0$ , which could reduce the chances of obtaining a violated aggregated inequality.
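
The class-selection rule above amounts to a sort on the bilinear residuals. The following is a minimal sketch, assuming the current solution is stored in dictionaries keyed by arc, $y$ index, and class index; the function name `rank_classes`, the data layout, and the filter that drops zero residuals are assumptions of this sketch. The default `top=30` mirrors the number of candidates used for the case $m=1$ in Section 4.1.

```python
def rank_classes(x_bar, y_bar, z_bar, triples, top=30):
    """Rank candidate EC&R classes by the residual of their base equality.

    triples: iterable of (i, j, k) such that A^k_{ji} = 1.
    Returns up to `top` pairs (k, sign), sorted by decreasing Psi_k.
    """
    scored = []
    for i, j, k in triples:
        residual = y_bar[j] * x_bar[i] - z_bar[k]
        # Sign "+" when the residual is negative, "-" otherwise.
        sign = "+" if residual < 0 else "-"
        scored.append((abs(residual), k, sign))
    scored.sort(key=lambda t: t[0], reverse=True)
    # Assumption: classes with zero residual are skipped, since their base
    # equality already holds at the current point.
    return [(k, sign) for psi, k, sign in scored[:top] if psi > 0]


# Hypothetical LP solution with two arcs, one y variable, and two classes.
x_bar = {"a": 4.0, "b": 2.0}
y_bar = {1: 0.5}
z_bar = {0: 1.0, 1: 1.5}
ranking = rank_classes(x_bar, y_bar, z_bar, [("a", 1, 0), ("b", 1, 1)])
```

Here class 0 has residual $1.0$ and receives sign $-$ , while class 1 has residual $-0.5$ and receives sign $+$ , so class 0 is ranked first.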

Another observation that can help in choosing the forest structures concerns the relaxation step in the EC&R procedure. As described in Remark 3.1, each remaining bilinear term $y_{j}x_{i}$ can be relaxed using either the bound constraints or the bilinear constraints. The former case is equivalent to aggregating the inequality with a constraint of the form $h(\bm{y})g(\bm{x})\geq 0$ where $h(\bm{y})=y_{j}$ and $g(\bm{x})\in\{x_{i}\geq 0,u_{i}-x_{i}\geq 0\}$ , for which the previous argument holds about achieving a violation. For the latter case, on the other hand, we aggregate the inequality with a bilinear constraint of the form $\pm(y_{j}x_{i}-z_{k})\geq 0$ for $i,j,k$ such that $A^{k}_{ji}=1$ , which can potentially lead to a violation depending on the value of $\Psi_{k}=|\bar{y}_{j}\bar{x}_{i}-\bar{z}_{k}|$ . As a result, we might choose forest structures that contain the nodes incident to arcs $i\in\mathrm{A}$ corresponding to the largest values of $|\bar{y}_{j}\bar{x}_{i}-\bar{z}_{k}|$ . In our computational experiments presented in Section 4, we use the above-mentioned heuristics in our separation oracle to efficiently select the class of EC&R assignments and their forest structures, and these heuristics show promising results.

4 Computational Experiments

In this section, we present preliminary computational results to evaluate the impact of the EC&R cutting planes generated through the results of Section 3. We study several basic network structures, from both dense and sparse classes, that can be used to obtain different relaxations for any network problem by isolating those structures in the underlying model. These structures include bipartite, clique, and cycle, as shown in Figures 2(a)–2(c), to represent different density levels for the underlying graph. To form set $\mathcal{S}$ with network model $\Xi$ , for each structure, we generate supply/demand vectors in such a way that the underlying network problem is feasible. In the following, we give details about the data generation. For $\Delta$ , we consider two scenarios for each structure, one for the case with $m=1$ , and the other for the case with $m=2$ . The former case is fundamental as it can always be used as a basic relaxation even when there are multiple $y$ variables in the model. The latter case is of particular importance among instances of $\mathcal{S}$ with multiple $y$ variables in the simplex $\Delta$ , as it represents the pairwise conflict between variables. Specifically, when two binary variables $y_{i}$ and $y_{j}$ cannot be positive at the same time, which can also be modeled as complementarity constraints of the form $y_{i}y_{j}=0$ for each $i,j$ that belong to a so-called conflict graph defined on $y$ variables, we may use such formulations with $m=2$ .

As the base relaxation for set $\mathcal{S}$ , we consider its linearized LP relaxation, where $y$ variables are continuous, and the bilinear constraints $y_{j}x_{i}-z_{k}=0$ , for all $k\in K$ and $(i,j)\in N\times M$ such that $A^{k}_{ji}=1$ , are replaced with the well-known McCormick bounds $z_{k}\geq 0$ , $z_{k}\geq u_{i}y_{j}+x_{i}-u_{i}$ , $z_{k}\leq x_{i}$ , and $z_{k}\leq u_{i}y_{j}$ . We denote this set by $\mathcal{S}_{L}$ . To evaluate the impact of adding EC&R cuts on strengthening $\mathcal{S}_{L}$ , we optimize a linear function in $x$ and $z$ variables over this set:

[TABLE]

which provides an LP relaxation for the original problem

[TABLE]

We use the special tree structure of Section 3.1 for the case with $m=1$ , and the special forest structure of Section 3.2 for the case with $m=2$ , to produce EC&R cutting planes that can be added to $\mathcal{S}_{L}$ to improve the dual bound in the optimization problem. Our experiments show the effectiveness of the proposed EC&R inequalities in improving the classical McCormick bounds. While the EC&R cuts that we obtain are valid for both cases where $y$ variables are binary and continuous, we consider the binary case in these computational experiments so that we can obtain the optimal value of the original problem (33) and use it to compare the bound improvement achieved from adding the EC&R cuts to (32). This statement follows from the fact that the McCormick formulation is an exact reformulation of the original problem when $y$ variables are binary. For these experiments, the codes are written in Python 3.7.8, and the optimization problems are solved using CPLEX 20.1.0 at its default settings.
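
The exactness of the McCormick bounds for binary $y$ can be checked directly from the four inequalities stated above. The following is a minimal sketch for a single term $z=xy$ with $0\leq x\leq u$ and $0\leq y\leq 1$ ; the function name `mccormick_interval` and the numerical values are illustrative assumptions.

```python
def mccormick_interval(x, y, u):
    """Interval of z allowed by the four McCormick bounds for z = x * y.

    Bounds: z >= 0, z >= u*y + x - u, z <= x, z <= u*y,
    assuming 0 <= x <= u and 0 <= y <= 1.
    """
    lower = max(0, u * y + x - u)
    upper = min(x, u * y)
    return lower, upper


u = 10
# For binary y, the interval collapses to the single point z = x * y.
binary_cases = [
    mccormick_interval(x, y, u) == (x * y, x * y)
    for x in (0, 3, 10)
    for y in (0, 1)
]
# For fractional y, the interval can be strictly wider than the product.
fractional = mccormick_interval(5, 0.5, u)
```

With $x=5$ , $y=0.5$ , and $u=10$ , the bounds allow any $z\in[0,5]$ although $xy=2.5$ , which is the slack that the EC&R cuts aim to reduce in the LP relaxation.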

4.1 Bipartite Structure.

In this section, we perform computational experiments on bipartite structures; see Figure 2(a). This structure represents a mid-level density for the underlying graph. We consider three problem sizes, where the number of nodes in the underlying network is $10$ , $30$ , and $50$ . For each problem size, we create $10$ randomly generated instances for set $\mathcal{S}$ with the following specifications for its underlying network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ . Because the network is bipartite, we consider half the nodes in the left partition and half in the right partition, and there is a directed arc from each node in the left partition to each node in the right partition. For problems with sizes $10$ , $30$ , and $50$ , each arc is assigned a capacity that is randomly generated from a discrete uniform distribution between $(0,100)$ , $(0,100)$ , and $(0,200)$ , respectively. The coefficients of the $x$ variables in the objective function are randomly generated from a discrete uniform distribution from $(0,20)$ , $(0,20)$ , and $(0,30)$ , respectively, for problem sizes $10$ , $30$ , and $50$ . For all instances, the coefficient of $z$ variables in the objective function is randomly generated from a uniform interval $(-10,10)$ , and the supply/demand at each node is selected randomly from $(-200,200)$ in such a way that the supply and demand are balanced.

As noted earlier, we consider two cases for the number $m$ of $y$ variables in the simplex. Table 1 shows the results for the case with $m=1$ . The first column contains the problem size, and the second column shows the instance number. The third column represents the optimal value of (33) obtained by setting the $y$ variable as binary in (32). The fourth column shows the optimal value of the linearized LP relaxation of (33) given in (32). The next two columns under “Full EC&R ” show the result of adding all violated EC&R inequalities obtained from tree structures according to Theorem 3.8 with up to two cancellations. These cuts are added in loops after the LP relaxation is solved to separate the current optimal solution until the improvement in the optimal value is less than 1% for problems with sizes 10 and 30, and 2% for problems with size 50 due to its higher computational cost. To find the most violated EC&R inequalities produced from an EC&R assignment, we use the technique in Remark 3.5. The column “Gap” contains the gap improvement obtained by adding these EC&R inequalities over the optimal value of (32). The next column shows the total solution time to add these inequalities. The column “Gap” under “Separation EC&R ” includes the result of adding the above EC&R inequalities through a separation oracle according to that discussed at the end of Section 3. In particular, for a current optimal solution $(\bar{\bm{x}};\bar{\bm{y}};\bar{\bm{z}})$ , we consider the EC&R assignment class and sign associated with the 30 largest values for $\Psi_{k}=|\bar{y}_{j}\bar{x}_{i}-\bar{z}_{k}|$ for all $(i,j,k)$ such that $A^{k}_{ji}=1$ . We add the resulting EC&R inequalities in loops as discussed above. The last column shows the solution time when using this separation method. The last row for each problem size reports the average values over the 10 random instances. The results in Table 1 have three important implications. 
First, they show the effectiveness of our proposed EC&R inequalities based on the tree structures in improving the gap closure and strengthening the classical McCormick relaxation. Second, they confirm the general observation that EC&R inequalities with fewer aggregations (up to three in these experiments) tend to be the most effective, as they account for more than 90% of the total gap closure for most instances. Third, they demonstrate the effectiveness of the proposed separation method, which achieves similar gap improvement levels in much less time compared to the case without separation. These observations show promise for an efficient implementation of the EC&R technique to solve practical problems.

In Table 2, we consider the case for $\mathcal{S}$ where $m=2$ . The first and second columns contain the problem size and instance number, respectively. The third and fourth columns show the optimal value of the original problem and its LP relaxation, respectively. The next four columns under “Tree EC&R ” show the result of adding EC&R inequalities with up to three aggregations (two cancellations) obtained from the one-variable relaxations of set $\mathcal{S}$ where only one $y$ variable is considered. For this approach, we use the EC&R results of Theorem 3.8 to identify the tree structures for each one-variable relaxation and add the resulting cutting planes for each relaxation separately through loops as previously described. The subcolumns under the “Full” and “Separation” headers contain the gap closure and the solution time to add these inequalities without and with the aforementioned separation oracle, respectively. For these instances, we consider 60 of the largest values for $\Psi_{k}$ in the separation procedure. The next four columns under “Forest EC&R ” include the result of adding EC&R inequalities with up to three aggregations obtained for $\mathcal{S}$ where both $y$ variables are considered in their original simplex. To add EC&R cutting planes, we consider the forest structures according to Theorem 3.24. The subcolumns under the “Full” and “Separation” headers contain the gap closure and the solution time to implement these cutting planes without and with the separation oracle, respectively. It is evident from these results that the EC&R cuts obtained from the forest structure that involve all $y$ variables outperform those obtained from the tree structures that consider $y$ variables individually, showing the effectiveness of the class of EC&R inequalities with the pairwise cancellation property. 
Further, these results show the notable impact of using a separation oracle to produce the EC&R inequalities on reducing the solution time, especially for larger problems, where the time savings are of orders of magnitude.

4.2 Clique Structure.

In this section, we perform computational experiments on clique structures; see Figure 2(b). This structure represents a high-level density for the underlying graph. We consider three problem sizes, where the number of nodes in the underlying network is $10$ , $20$ , and $30$ . For each problem size, we create $10$ randomly generated instances for set $\mathcal{S}$ with the following specifications for its underlying network $\mathrm{G}=(\mathrm{V},\mathrm{A})$ . For the clique networks, there is an arc between each pair of nodes. We determine the direction of each arc through a random binary value. Each arc is assigned a capacity that is randomly generated from a discrete uniform distribution between $(0,100)$ . The coefficient of the $z$ variables in the objective function is randomly generated from a uniform distribution between $(-10,10)$ , plus a random number between [math] and $0.1$ that is multiplied by the number of nodes in that network. This second random element is added to widen the optimality gap for a meaningful comparison of the different approaches. The coefficient for $x$ variables in the objective function is randomly generated from a discrete uniform distribution between $(0,20)$ , and the supply/demand at each node is selected randomly from $(-200,200)$ in such a way that the supply and demand are balanced.

The results for the clique structure are reported in Tables 3 and 4 for the cases with $m=1$ and $m=2$ , respectively. The definition of columns in each table is similar to that of Tables 1 and 2. For the clique structures, we observe similar patterns to those of the bipartite case in the gap improvement and solution time of the different approaches.

4.3 Cycle Structure.

In this section, we perform computational experiments on cycle structures; see Figure 2(c). This structure represents a low density for the underlying graph. We consider three problem sizes, where the number of nodes in the underlying network is $50$, $100$, and $200$. For each problem size, we create $10$ randomly generated instances for set $\mathcal{S}$ with the following specifications for its underlying network $\mathrm{G}=(\mathrm{V},\mathrm{A})$. Each node in the cycle is either a supply or a demand node. The nodes adjacent to a supply node are both demand nodes, and the nodes adjacent to a demand node are both supply nodes. Each arc in the cycle is directed from its incident supply node to its incident demand node. The coefficient of each $z$ variable in the objective function is randomly generated from a uniform distribution between $(-18,22)$. The coefficient of each $x$ variable in the objective function is randomly generated from a discrete uniform distribution between $(0,20)$. The supply at each supply node is generated randomly from a discrete uniform distribution between $(100,200)$. The demand at each demand node is generated randomly from a discrete uniform distribution between $(0,100)$. We note that, unlike the network structures studied in the previous sections, this model is not balanced, because a balanced model would have only a single solution. As a result, we change the equality flow-balance constraints to inequalities to account for an unbalanced model. We set the capacity of each arc equal to $150$ to ensure that the problem is feasible.

The computational results are given in Tables 5 and 6 for the cases with $m=1$ and $m=2$, respectively. The columns in these tables are defined similarly to those of the previous tables, with the difference that the separation columns are omitted: implementing the full EC&R approaches is fast, so a separation oracle is not necessary. This fast performance can be attributed to the sparsity of the underlying graph, which allows a complete set of EC&R inequalities to be generated in a short amount of time because the aggregation options that yield the desired cancellations are limited. As a result, we can achieve high gap improvements for larger problems, as evidenced in Tables 5 and 6.
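The cycle instances described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the authors' generator: the function names and data layout are hypothetical, even-indexed nodes are arbitrarily chosen as supply nodes, and interval endpoints are treated as inclusive. The helper `half_split_flow` additionally assumes that the relaxed flow-balance constraints read "outflow at most the supply" and "inflow at least the demand", which the text does not state explicitly; under that reading it exhibits one feasible flow.

```python
import random


def generate_cycle_instance(n, m=1, seed=0):
    """Sketch of a random cycle instance with alternating supply/demand nodes."""
    if n < 4 or n % 2:
        raise ValueError("alternation on a cycle needs an even n >= 4")
    rng = random.Random(seed)
    # Assumption: even nodes are supply nodes, odd nodes are demand nodes.
    supply = {i: rng.randint(100, 200) for i in range(0, n, 2)}
    demand = {i: rng.randint(0, 100) for i in range(1, n, 2)}
    arcs = {}
    for s in supply:
        # Each arc points from a supply node to an adjacent demand node.
        for d in ((s - 1) % n, (s + 1) % n):
            arcs[(s, d)] = {
                "capacity": 150,
                "x_cost": rng.randint(0, 20),
                "z_cost": [rng.uniform(-18, 22) for _ in range(m)],
            }
    return supply, demand, arcs


def half_split_flow(demand, arcs, n):
    """Serve each demand node equally from its two supply neighbours."""
    flow = {arc: 0.0 for arc in arcs}
    for d, amount in demand.items():
        for s in ((d - 1) % n, (d + 1) % n):
            flow[(s, d)] = amount / 2
    return flow
```

With demands of at most $100$, each arc in this flow carries at most $50$ units and each supply node ships at most $100$ units, so the capacity of $150$ and the supplies of at least $100$ are never binding in this particular flow.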

5 Conclusion

We study a bipartite bilinear set, where the variables in one partition belong to a network flow model, and the variables in the other partition belong to a simplex. We design a convexification technique based on the aggregation of side constraints with appropriate weights. This technique produces an important class of facet-defining inequalities for the convex hull of the bilinear set, and it fully describes the convex hull in the special case where the simplex contains a single variable. We show that each such inequality can be obtained by considering the constraints corresponding to the nodes of the underlying network that form a special tree or forest structure. This property leads to an explicit derivation of strong inequalities through identifying special graphical structures in the network model. These inequalities can be added to the classical McCormick relaxation to strengthen it and improve the dual bounds, as corroborated by the preliminary computational experiments conducted on various basic network structures.

Bibliography

1. Al-Khayyal FA, Falk JE (1983) Jointly constrained biconvex programming. Mathematics of Operations Research 8:273–286.
2. Balas E (1998) Disjunctive programming: Properties of the convex hull of the feasible points. Discrete Applied Mathematics 89:3–44.
3. Beale EML, Forrest JJH (1976) Global optimization using special ordered sets. Mathematical Programming 10:52–69.
4. Ben-Ayed O, Blair CE (1990) Computational difficulties of bilevel linear programming. Operations Research 38:556–560.
5. Boland N, Dey SS, Kalinowski T, Molinaro M, Rigterink F (2017) Bounding the gap between the McCormick relaxation and the convex hull for bilinear functions. Mathematical Programming 162:523–532.
6. Chiou SW (2005) Bilevel programming for the continuous transport network design problem. Transportation Research Part B: Methodological 39(4):361–383.
7. Davarnia D (2016) Convexification Techniques for Bilinear and Complementarity Constraints with Application to Network Interdiction. Ph.D. thesis, University of Florida, Gainesville, FL, USA.
8. Davarnia D, Richard JPP, Tawarmalani M (2017) Simultaneous convexification of bilinear functions over polytopes with application to network interdiction 27:1801–1833.
