Simplex QP-based methods for minimizing a conic quadratic objective over polyhedra
Alper Atamturk, Andres Gomez

TL;DR
This paper introduces simplex QP-based methods for efficiently solving conic quadratic optimization problems over polyhedra, outperforming interior point methods especially in branch-and-bound contexts for discrete problems.
Contribution
It proposes a reformulation using the perspective of the quadratic function, enabling simple coordinate descent and bisection algorithms that are suitable for warm starts and branch-and-bound algorithms.
Findings
Up to 22x faster for large convex instances
Higher precision solutions compared to interior point methods
Significant speed-ups over barrier-based and LP-based branch-and-bound algorithms
Abstract
We consider minimizing a conic quadratic objective over a polyhedron. Such problems arise in parametric value-at-risk minimization, portfolio optimization, and robust optimization with ellipsoidal objective uncertainty; and they can be solved by polynomial interior point algorithms for conic quadratic optimization. However, interior point algorithms are not well-suited for branch-and-bound algorithms for the discrete counterparts of these problems due to the lack of effective warm starts necessary for the efficient solution of convex relaxations repeatedly at the nodes of the search tree. In order to overcome this shortcoming, we reformulate the problem using the perspective of the quadratic function. The perspective reformulation lends itself to simple coordinate descent and bisection algorithms utilizing the simplex method for quadratic programming, which makes the solution methods…
| Tolerance | BAR | ALG1 | |||||
|---|---|---|---|---|---|---|---|
| optgap | time | #iter | optgap | time | #iter | #QP | |
| 29.9 | 10 | 3.2 | 835 | 4 | |||
| 41.5 | 15 | 4.2 | 844 | 6 | |||
| 54.6 | 23 | 4.3 | 844 | 8 | |||
| 62.9 | 27 | 4.7 | 844 | 10 | |||
| 66.8 | 29 | 5.2 | 844 | 12 | |||
| 69.6 | 30 | 5.4 | 844 | 13 | |||
| 72.0 | 32 | 6.0 | 844 | 15 | |||
| 74.0 | 33 | 6.2 | 844 | 17 | |||
| 75.9 | 34 | 6.6 | 844 | 19 | |||
| 78.7 | 35 | 7.0 | 844 | 21 | |||
| 79.6 | 36 | 7.4 | 844 | 23 | |||
| 89.6 | 39 | 7.8 | 844 | 25 | |||
| Method | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| time | #iter | #QP | time | #iter | #QP | time | #iter | #QP | time | #iter | #QP | |||
| 100 | 0.1 | ALG1 | 1.0 | 22 | 20 | 1.1 | 53 | 24 | 1.3 | 104 | 29 | 1.4 | 123 | 26 |
| ALG2 | 0.8 | 41 | 14 | 0.9 | 95 | 15 | 0.9 | 150 | 15 | 1.1 | 219 | 16 | ||
| BAR | 4.6 | 16 | - | 4.9 | 24 | - | 5.2 | 26 | - | 5.1 | 25 | - | ||
| MOS | 2.1 | 10 | - | 2.4 | 11 | - | 2.2 | 10 | - | 2.4 | 11 | - | ||
| 100 | 0.5 | ALG1 | 1.1 | 33 | 23 | 1.1 | 69 | 24 | 1.5 | 144 | 37 | 1.6 | 192 | 30 |
| ALG2 | 0.8 | 60 | 14 | 0.9 | 125 | 15 | 0.9 | 200 | 15 | 1.1 | 251 | 16 | ||
| BAR | 4.5 | 21 | - | 5.1 | 25 | - | 5.8 | 29 | - | 5.7 | 29 | - | ||
| MOS | 2.2 | 10 | - | 2.2 | 10 | - | 2.4 | 10 | - | 2.4 | 11 | - | ||
| 200 | 0.1 | ALG1 | 0.9 | 33 | 19 | 1.1 | 73 | 25 | 1.2 | 110 | 25 | 1.4 | 157 | 26 |
| ALG2 | 0.8 | 49 | 14 | 0.9 | 126 | 14 | 0.9 | 172 | 14 | 1.1 | 259 | 15 | ||
| BAR | 4.7 | 22 | - | 4.5 | 22 | - | 5.1 | 25 | - | 5.3 | 27 | - | ||
| MOS | 2.4 | 11 | - | 2.6 | 12 | - | 2.5 | 11 | - | 2.4 | 11 | - | ||
| 200 | 0.5 | ALG1 | 1.0 | 48 | 22 | 1.1 | 99 | 22 | 1.2 | 151 | 25 | 1.6 | 218 | 24 |
| ALG2 | 0.9 | 94 | 14 | 0.9 | 179 | 14 | 1.0 | 233 | 15 | 1.3 | 326 | 15 | ||
| BAR | 4.4 | 21 | - | 4.9 | 24 | - | 5.2 | 26 | - | 5.8 | 31 | - | ||
| MOS | 2.3 | 10 | - | 2.4 | 10 | - | 2.4 | 11 | - | 2.6 | 11 | - | ||
| avg | ALG1 | 1.0 | 34 | 21 | 1.1 | 73 | 24 | 1.3 | 127 | 29 | 1.5 | 173 | 27 | |
| ALG2 | 0.8 | 61 | 14 | 0.9 | 131 | 15 | 0.9 | 189 | 15 | 1.2 | 264 | 15 | ||
| BAR | 4.3 | 20 | - | 4.9 | 24 | - | 5.3 | 27 | - | 5.5 | 28 | - | ||
| MOS | 2.3 | 10 | - | 2.4 | 11 | - | 2.4 | 11 | - | 2.4 | 11 | - | ||
| Method | ||||||||||||||
| time | #iter | #QP | time | #iter | #QP | time | #iter | #QP | time | #iter | #QP | |||
| 100 | 0.1 | ALG1 | 4.9 | 940 | 12 | 7.0 | 1,307 | 16 | 8.5 | 1,505 | 18 | 10.5 | 1,756 | 21 |
| ALG2 | 5.4 | 1,283 | 11 | 7.0 | 1,637 | 13 | 8.4 | 1,865 | 14 | 9.7 | 2,375 | 13 | ||
| BAR | 81.1 | 26 | - | 64.3 | 21 | - | 55.3 | 16 | - | 56.8 | 16 | - | ||
| MOS | 19.4 | 19 | - | 16.8 | 17 | - | 15.6 | 18 | - | 15.8 | 19 | - | ||
| 100 | 0.5 | ALG1 | 5.2 | 902 | 14 | 8.2 | 1,191 | 21 | 9.3 | 1,391 | 21 | 11.0 | 1,641 | 21 |
| ALG2 | 5.5 | 1,148 | 12 | 7.2 | 1,474 | 13 | 8.6 | 1,772 | 14 | 9.6 | 2,020 | 14 | ||
| BAR | 62.7 | 19 | - | 56.0 | 16 | - | 57.1 | 16 | - | 57.7 | 16 | - | ||
| MOS | 17.4 | 17 | - | 17.8 | 18 | - | 15.7 | 19 | - | 14.9 | 17 | - | ||
| 200 | 0.1 | ALG1 | 4.9 | 836 | 14 | 6.3 | 1,053 | 15 | 8.3 | 1,220 | 18 | 14.4 | 1,429 | 17 |
| ALG2 | 4.9 | 932 | 12 | 6.8 | 1,377 | 13 | 8.4 | 1,671 | 13 | 12.4 | 1,833 | 13 | ||
| BAR | 76.8 | 25 | - | 60.1 | 18 | - | 66.0 | 20 | - | 128.3 | 21 | - | ||
| MOS | 16.2 | 17 | - | 16.1 | 18 | - | 15.5 | 18 | - | 15.2 | 18 | - | ||
| 200 | 0.5 | ALG1 | 4.5 | 858 | 12 | 6.2 | 1,048 | 15 | 7.6 | 1,237 | 16 | 12.7 | 1,387 | 18 |
| ALG2 | 4.9 | 978 | 12 | 6.8 | 1,363 | 13 | 8.5 | 1,626 | 13 | 15.9 | 1,794 | 14 | ||
| BAR | 83.1 | 26 | - | 72.7 | 21 | - | 64.6 | 18 | - | 101.2 | 16 | - | ||
| MOS | 18.1 | 19 | - | 16.8 | 20 | - | 16.9 | 20 | - | 16.2 | 19 | - | ||
| avg | ALG1 | 4.9 | 884 | 13 | 6.9 | 1,150 | 17 | 8.4 | 1,338 | 18 | 12.1 | 1,553 | 20 | |
| ALG2 | 5.2 | 1,086 | 12 | 6.9 | 1,463 | 13 | 8.5 | 1,734 | 13 | 11.9 | 2,005 | 14 | ||
| BAR | 75.9 | 24 | - | 63.2 | 19 | - | 60.7 | 17 | - | 86.0 | 17 | - | ||
| MOS | 17.8 | 18 | - | 16.8 | 18 | - | 15.9 | 19 | - | 15.5 | 18 | - | ||
| Method | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| time | #iter | #QP | time | #iter | #QP | time | #iter | #QP | time | #iter | #QP | |
| ALG1 | 0.2 | 43 | 20 | 0.6 | 65 | 19 | 2.8 | 75 | 25 | 11.7 | 104 | 25 |
| ALG2 | 0.2 | 73 | 14 | 0.5 | 116 | 14 | 2.2 | 129 | 15 | 9.1 | 175 | 15 |
| BAR | 0.3 | 21 | - | 2.4 | 22 | - | 22.1 | 27 | - | 204.9 | 30 | - |
| MOS | 0.2 | 9 | - | 1.2 | 11 | - | 7.3 | 12 | - | 50.4 | 12 | - |
| Method | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| time | nodes | egap | #s | time | nodes | egap | #s | time | nodes | egap | #s | time | nodes | egap | #s | |||
| 100 | 0.1 | BBA1 | 1 | 156 | 0.0 | 5 | 29 | 3,271 | 0.0 | 5 | 685 | 68,318 | 0.0 | 5 | 3,644 | 272,527 | 0.1 | 4 |
| BBA2 | 3 | 156 | 0.0 | 5 | 83 | 3,271 | 0.0 | 5 | 1,823 | 68,317 | 0.0 | 5 | 6,218 | 172,489 | 0.3 | 2 | ||
| BBBR | 16 | 156 | 0.0 | 5 | 349 | 3,270 | 0.0 | 5 | 4,664 | 43,695 | 0.1 | 3 | 7,200 | 23,324 | 1.2 | 0 | ||
| CXBR | 35 | 276 | 0.0 | 5 | 513 | 3,497 | 0.0 | 5 | 5,260 | 32,782 | 0.2 | 2 | 7,200 | 17,169 | 1.3 | 0 | ||
| CXLP | 34 | 9,562 | 0.0 | 5 | 7,200 | 209,576 | 0.7 | 0 | 7,200 | 244,911 | 2.2 | 0 | 7,200 | 265,485 | 3.8 | 0 | ||
| CXLPE | 2 | 374 | 0.0 | 5 | 91 | 7,640 | 0.0 | 5 | 2,788 | 111,293 | 0.0 | 5 | 6,983 | 191,065 | 0.6 | 1 | ||
| CXD | 4 | 368 | 0.0 | 5 | 42 | 5,152 | 0.0 | 5 | 640 | 58,076 | 0.0 | 5 | 3,778 | 183,816 | 0.1 | 4 | ||
| 100 | 0.5 | BBA1 | 1 | 87 | 0.0 | 5 | 59 | 6,274 | 0.0 | 5 | 1,469 | 140,874 | 0.0 | 5 | 6,328 | 447,989 | 0.4 | 2 |
| BBA2 | 2 | 87 | 0.0 | 5 | 160 | 6,274 | 0.0 | 5 | 4,024 | 139,154 | 0.0 | 4 | 7,200 | 223,921 | 0.7 | 0 | ||
| BBBR | 10 | 87 | 0.0 | 5 | 686 | 6,274 | 0.0 | 5 | 6,134 | 56,394 | 0.3 | 1 | 7,200 | 21,111 | 1.7 | 0 | ||
| CXBR | 24 | 183 | 0.0 | 5 | 1,027 | 6,734 | 0.0 | 5 | 6,399 | 39,710 | 0.4 | 1 | 7,200 | 20,101 | 2.0 | 0 | ||
| CXLP | 294 | 26,957 | 0.0 | 5 | 7,200 | 229,641 | 0.8 | 0 | 7,200 | 263,810 | 2.3 | 0 | 7,200 | 244,863 | 4.7 | 0 | ||
| CXLPE | 2 | 349 | 0.0 | 5 | 218 | 14,737 | 0.0 | 5 | 5,116 | 215,292 | 0.1 | 2 | 7,200 | 170,710 | 1.1 | 0 | ||
| CXD | 3 | 373 | 0.0 | 5 | 164 | 16,070 | 0.0 | 5 | 3,643 | 245,251 | 0.0 | 4 | 7,042 | 336,814 | 0.8 | 1 | ||
| 200 | 0.1 | BBA1 | 1 | 247 | 0.0 | 5 | 23 | 3,259 | 0.0 | 5 | 637 | 55,248 | 0.0 | 5 | 3,761 | 344,662 | 0.2 | 4 |
| BBA2 | 5 | 247 | 0.0 | 5 | 79 | 3,259 | 0.0 | 5 | 2,083 | 55,248 | 0.0 | 5 | 6,975 | 242,548 | 0.4 | 2 | ||
| BBBR | 24 | 247 | 0.0 | 5 | 321 | 3,259 | 0.0 | 5 | 4,573 | 39,647 | 0.1 | 3 | 7,200 | 17,017 | 1.4 | 0 | ||
| CXBR | 52 | 460 | 0.0 | 5 | 540 | 3,711 | 0.0 | 5 | 5,295 | 34,090 | 0.2 | 2 | 7,200 | 13,490 | 1.5 | 0 | ||
| CXLP | 221 | 17,205 | 0.0 | 5 | 7,200 | 208,874 | 0.6 | 0 | 7,200 | 230,304 | 2.0 | 0 | 7,200 | 186,490 | 4.2 | 0 | ||
| CXLPE | 4 | 473 | 0.0 | 5 | 139 | 6,064 | 0.0 | 5 | 4,073 | 111,205 | 0.1 | 3 | 7,200 | 158,866 | 0.9 | 0 | ||
| CXD | 5 | 360 | 0.0 | 5 | 60 | 6,413 | 0.0 | 5 | 1,410 | 67,577 | 0.0 | 5 | 7,044 | 349,653 | 0.5 | 1 | ||
| 200 | 0.5 | BBA1 | 4 | 674 | 0.0 | 5 | 194 | 24,636 | 0.0 | 5 | 1,674 | 156,632 | 0.0 | 5 | 5,778 | 526,215 | 0.3 | 2 |
| BBA2 | 15 | 674 | 0.0 | 5 | 633 | 24,635 | 0.0 | 5 | 3,446 | 98,028 | 0.1 | 4 | 7,200 | 259,040 | 0.6 | 0 | ||
| BBBR | 77 | 674 | 0.0 | 5 | 2,106 | 17,743 | 0.0 | 4 | 5,590 | 47,725 | 0.2 | 2 | 7,200 | 20,422 | 1.5 | 0 | ||
| CXBR | 104 | 680 | 0.0 | 5 | 2,452 | 15,816 | 0.0 | 4 | 6,127 | 38,973 | 0.3 | 1 | 7,200 | 14,955 | 1.7 | 0 | ||
| CXLP | 3,514 | 120,007 | 0.1 | 4 | 7,200 | 212,082 | 1.0 | 0 | 7,200 | 240,445 | 2.3 | 0 | 7,200 | 195,841 | 4.8 | 0 | ||
| CXLPE | 21 | 1,461 | 0.0 | 5 | 1,739 | 61,593 | 0.0 | 4 | 5,435 | 163,105 | 0.2 | 2 | 7200 | 197,287 | 1.1 | 0 | ||
| CXD | 18 | 1,612 | 0.0 | 5 | 1,211 | 75,098 | 0.0 | 5 | 5,017 | 245,412 | 0.2 | 2 | 7,200 | 319,645 | 1.0 | 0 | ||
| avg | BBA1 | 2 | 291 | 0.0 | 20 | 76 | 9,360 | 0.0 | 20 | 1,116 | 105,268 | 0.0 | 20 | 4,878 | 397,848 | 0.3 | 12 | |
| BBA2 | 6 | 291 | 0.0 | 20 | 239 | 9,360 | 0.0 | 20 | 2,844 | 90,187 | 0.0 | 18 | 6,898 | 224,500 | 0.5 | 4 | ||
| BBBR | 32 | 291 | 0.0 | 20 | 865 | 7,637 | 0.0 | 19 | 5,240 | 46,865 | 0.2 | 9 | 7,200 | 20,469 | 1.4 | 0 | ||
| CXBR | 54 | 400 | 0.0 | 20 | 1,133 | 7,440 | 0.0 | 19 | 5,770 | 36,389 | 0.3 | 6 | 7,200 | 16,429 | 1.6 | 0 | ||
| CXLP | 1,016 | 43,433 | 0.0 | 19 | 7,200 | 215,043 | 0.8 | 0 | 7,200 | 244,867 | 2.2 | 0 | 7,200 | 223,170 | 4.4 | 0 | ||
| CXLPE | 7 | 664 | 0.0 | 20 | 547 | 20,800 | 0.0 | 19 | 4,353 | 139,632 | 0.1 | 12 | 7,146 | 179,482 | 0.9 | 1 | ||
| CXD | 7 | 678 | 0.0 | 20 | 369 | 25,683 | 0.0 | 20 | 2,677 | 151,588 | 0.1 | 16 | 6,267 | 297,482 | 0.6 | 6 | ||
| Method | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| time | nodes | egap | #s | time | nodes | egap | #s | time | nodes | egap | #s | time | nodes | egap | #s | |||
| 100 | 0.1 | BBA1 | 287 | 145 | 0.0 | 5 | 4,511 | 1,774 | 0.0 | 5 | 7,200 | 2,720 | 5.9 | 0 | 7,200 | 2,360 | 13.9 | 0 |
| BBA2 | 619 | 142 | 0.0 | 5 | 6,871 | 1,105 | 1.1 | 1 | 7,200 | 1,278 | 7.9 | 0 | 7,200 | 974 | 17.2 | 0 | ||
| BBBR | 3,577 | 91 | 0.2 | 4 | 7,200 | 184 | 4.9 | 0 | 7,200 | 236 | 11.7 | 0 | 7,200 | 61 | 37.2 | 0 | ||
| CXBR | 7,200 | 67 | 20.8 | 0 | 7,200 | 79 | 0 | 7,200 | 129 | 0 | 7,200 | 13 | 0 | |||||
| CXLP | 533 | 2,428 | 0.0 | 5 | 7,200 | 24,776 | 4.5 | 0 | 7,200 | 20,099 | 15.0 | 0 | 7,200 | 8,971 | 30.2 | 0 | ||
| CXLPE | 802 | 315 | 0.0 | 5 | 6,726 | 1,967 | 2.9 | 1 | 7,200 | 2,585 | 23.5 | 0 | 7,200 | 2,377 | 45.1 | 0 | ||
| CXD | 1,466 | 164 | 0.0 | 5 | 4,655 | 1,176 | 0.0 | 5 | 7,200 | 2,428 | 7.4 | 0 | 7,200 | 2,047 | 16.6 | 0 | ||
| 100 | 0.5 | BBA1 | 625 | 353 | 0.0 | 5 | 5,424 | 1,904 | 0.6 | 2 | 6,512 | 2,725 | 4.8 | 1 | 7,200 | 2,833 | 12.5 | 0 |
| BBA2 | 1,457 | 362 | 0.0 | 5 | 6,286 | 971 | 1.6 | 1 | 7,200 | 1,369 | 7.3 | 0 | 7,200 | 1,296 | 16.2 | 0 | ||
| BBBR | 6,071 | 134 | 0.7 | 2 | 7,200 | 175 | 4.9 | 0 | 7,200 | 162 | 11.5 | 0 | 7,200 | 28 | 0 | |||
| CXBR | 7,200 | 24 | 0 | 7,200 | 56 | 0 | 7,200 | 70 | 0 | 7,200 | 13 | 0 | ||||||
| CXLP | 1,132 | 6,187 | 0.0 | 5 | 7,200 | 23,671 | 4.4 | 0 | 7,200 | 16,851 | 12.8 | 0 | 7,200 | 8,180 | 23.8 | 0 | ||
| CXLPE | 967 | 607 | 0.0 | 5 | 6,420 | 2,077 | 3.4 | 1 | 7,200 | 3,070 | 16.2 | 0 | 7,200 | 3,036 | 42.3 | 0 | ||
| CXD | 1,645 | 267 | 0.0 | 5 | 6,421 | 1,931 | 0.5 | 3 | 7,200 | 2,659 | 5.6 | 0 | 7,200 | 4,166 | 12.8 | 0 | ||
| 200 | 0.1 | BBA1 | 155 | 77 | 0.0 | 5 | 2,392 | 1,075 | 0.0 | 5 | 6,434 | 3,380 | 2.6 | 1 | 7,200 | 3,245 | 10.0 | 0 |
| BBA2 | 306 | 79 | 0.0 | 5 | 4,324 | 823 | 0.3 | 3 | 7,152 | 1,469 | 4.6 | 1 | 7,200 | 1,195 | 12.7 | 0 | ||
| BBBR | 3,245 | 76 | 0.0 | 5 | 7,200 | 171 | 2.4 | 0 | 7,200 | 180 | 10.5 | 0 | 7,200 | 33 | 0 | |||
| CXBR | 7,200 | 34 | 0 | 7,200 | 45 | 0 | 7,200 | 68 | 0 | 7,200 | 11 | 0 | ||||||
| CXLP | 436 | 1,548 | 0.0 | 5 | 7,200 | 30,265 | 2.9 | 0 | 7,200 | 20,579 | 12.4 | 0 | 7,200 | 7,274 | 26.7 | 0 | ||
| CXLPE | 524 | 188 | 0.0 | 5 | 5,156 | 1,437 | 1.4 | 3 | 7,200 | 2,420 | 17.7 | 0 | 7,200 | 2,157 | 46.7 | 0 | ||
| CXD | 2,059 | 106 | 0.0 | 5 | 5,715 | 1,251 | 0.5 | 4 | 7,200 | 2,568 | 4.1 | 0 | 7,200 | 1,996 | 12.9 | 0 | ||
| 200 | 0.5 | BBA1 | 321 | 196 | 0.0 | 5 | 3,953 | 2,286 | 0.3 | 4 | 7,200 | 3,194 | 4.3 | 0 | 7,200 | 2,486 | 13.2 | 0 |
| BBA2 | 808 | 201 | 0.0 | 5 | 5,517 | 1,204 | 0.9 | 3 | 7,200 | 1,284 | 6.5 | 0 | 7,200 | 1,009 | 16.6 | 0 | ||
| BBBR | 4,826 | 113 | 0.2 | 3 | 7,200 | 173 | 3.9 | 0 | 7,200 | 176 | 12.7 | 0 | 7,200 | 54 | 43.6 | 0 | ||
| CXBR | 7,200 | 20 | 0 | 7,200 | 51 | 0 | 7,200 | 89 | 0 | 7,200 | 10 | 0 | ||||||
| CXLP | 859 | 4,989 | 0.0 | 5 | 7,200 | 28,007 | 4.4 | 0 | 7,200 | 18,873 | 13.3 | 0 | 7,200 | 10,313 | 28.9 | 0 | ||
| CXLPE | 1,046 | 399 | 0.0 | 5 | 5,948 | 2,212 | 1.8 | 2 | 7,200 | 2,305 | 21.2 | 0 | 7,200 | 2,048 | 51.2 | 0 | ||
| CXD | 2,281 | 177 | 0.0 | 5 | 4,975 | 1,873 | 0.4 | 4 | 7,200 | 2,233 | 6.9 | 0 | 7,200 | 1,325 | 16.0 | 0 | ||
| avg | BBA1 | 347 | 193 | 0.0 | 20 | 4,070 | 1,760 | 0.2 | 16 | 6,837 | 3,005 | 4.4 | 2 | 7,200 | 2,731 | 12.4 | 0 | |
| BBA2 | 798 | 196 | 0.0 | 20 | 5,750 | 1,026 | 1.0 | 8 | 7,189 | 1,350 | 6.6 | 1 | 7,200 | 1,119 | 15.7 | 0 | ||
| BBBR | 4,430 | 103 | 0.3 | 14 | 7,200 | 176 | 4.0 | 0 | 7,200 | 189 | 11.6 | 0 | 7,200 | 44 | 0 | |||
| CXBR | 7,200 | 36 | 0 | 7,200 | 58 | 0 | 7,200 | 89 | 0 | 7,200 | 12 | 0 | ||||||
| CXLP | 740 | 3,788 | 0.0 | 20 | 7,200 | 26,680 | 4.1 | 0 | 7,200 | 19,101 | 13.4 | 0 | 7,200 | 8,684 | 27.4 | 0 | ||
| CXLPE | 835 | 377 | 0.0 | 20 | 6,063 | 1,923 | 2.4 | 7 | 7,200 | 2,595 | 19.7 | 0 | 7,200 | 2,404 | 46.3 | 0 | ||
| CXD | 1,863 | 178 | 0.0 | 20 | 5,441 | 1,558 | 0.3 | 16 | 7,200 | 2,472 | 6.0 | 0 | 7,200 | 2,383 | 14.6 | 0 | ||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Simplex QP-based methods for minimizing
a conic quadratic objective over polyhedra
Alper Atamtürk and Andrés Gómez
Abstract.
We consider minimizing a conic quadratic objective over a polyhedron. Such problems arise in parametric value-at-risk minimization, portfolio optimization, and robust optimization with ellipsoidal objective uncertainty; and they can be solved by polynomial interior point algorithms for conic quadratic optimization. However, interior point algorithms are not well-suited for branch-and-bound algorithms for the discrete counterparts of these problems due to the lack of effective warm starts necessary for the efficient solution of convex relaxations repeatedly at the nodes of the search tree.
In order to overcome this shortcoming, we reformulate the problem using the perspective of the quadratic function. The perspective reformulation lends itself to simple coordinate descent and bisection algorithms utilizing the simplex method for quadratic programming, which makes the solution methods amenable to warm starts and suitable for branch-and-bound algorithms. We test the simplex-based quadratic programming algorithms to solve convex as well as discrete instances and compare them with the state-of-the-art approaches. The computational experiments indicate that the proposed algorithms scale much better than interior point algorithms and return higher precision solutions. In our experiments, for large convex instances, they provide up to 22x speed-up. For smaller discrete instances, the speed-up is about 13x over a barrier-based branch-and-bound algorithm and 6x over the LP-based branch-and-bound algorithm with extended formulations.
Keywords: Simplex method, conic quadratic optimization, quadratic programming, warm starts, value-at-risk minimization, portfolio optimization, robust optimization.
A. Atamtürk: Industrial Engineering & Operations Research, University of California, Berkeley, CA 94720-1777. [email protected]
A. Gómez: Department of Industrial Engineering, Swanson School of Engineering, University of Pittsburgh, PA 15261-3077. [email protected]
May 2017; May 2018
1. Introduction
Consider the minimization of a conic quadratic function over a polyhedron, i.e.,
[TABLE]
where is a symmetric positive semidefinite matrix, , and is a rational polyhedron. We denote by CDO the discrete counterpart of CO with integrality restrictions: . CO and CDO are frequently used to model utility with uncertain objectives as in parametric value-at-risk minimization [25], portfolio optimization [5], and robust counterparts of linear programs with an ellipsoidal objective uncertainty set [13, 14, 16].
Note that CO includes linear programming (LP) and convex quadratic programming (QP) as special cases. The simplex method [22, 41, 39] is still the most widely used algorithm for LP and QP, despite the fact that polynomial interior point algorithms [28, 34, 32] are competitive with the simplex method in many large-scale instances. Even though non-polynomial, the simplex method has some distinct advantages over interior point methods. Since the simplex method iterates over bases, it is possible to carry out the computations with high accuracy and little cost, while interior point methods come with a trade-off between precision and efficiency. Moreover, an optimal basis returned by the simplex method is useful for sensitivity analysis, while interior point methods do not produce such a basis unless an additional “crashing” procedure is performed [e.g. 31]. Finally, if the parameters of the problem change, re-optimization can often be done very fast with the simplex method starting from a primal or dual feasible basis, whereas warm starts with interior point methods have limitations [42, 21]. In particular, fast re-optimization with the dual simplex method is crucial when solving discrete optimization problems with a branch-and-bound algorithm.
CO is a special case of conic quadratic optimization [30, 3], which can be solved by polynomial-time interior points algorithms [2, 35, 15]. Although CO can be solved by a general conic quadratic solver, we show in this paper that iterative QP algorithms scale much better. In particular, simplex-based QP algorithms allowing warm starts perform much faster than interior point methods for CO.
For the discrete counterpart CDO, a number of different approaches are available for the special case with a diagonal matrix: Ishii et al. [27] give a polynomial time for optimization over spanning trees; Bertsimas and Sim [17] propose an approximation algorithm that solves series of linear integer programs; Atamtürk and Narayanan [7] give a cutting plane algorithm utilizing the submodularity of the objective for the binary case; Atamtürk and Goméz [4] give nonlinear cuts for the mixed 0-1 case; Atamtürk and Narayanan [8] give a parametric algorithm for the binary case with a cardinality constraint. Maximization of the same objective over the binaries is -hard [1].
The aforementioned approaches do not extend to the non-diagonal case or to general feasible regions, which are obviously -hard as quadratic and linear integer optimization are special cases. The branch-and-bound algorithm is the method of choice for general CDO. However, branch-and-bound algorithms that repeatedly employ a nonlinear programming (NLP) solver at the nodes of the search tree are typically hampered by the lack of effective warm starts. Borchers and Mitchell [20] and Leyffer [29] describe NLP-based branch-and-bound algorithms, and they give methods that branch without solving the NLPs to optimality, reducing the computational burden for the node relaxations. On the other hand, LP-based branch-and-bound approaches employ linear outer approximations of the nonlinear terms. This generally results in weaker relaxations at the nodes, compared to the NLP approaches, but allows one to utilize warm starts with the simplex method. Therefore, one is faced with a trade-off between the strength of the node relaxations and the solve time per node. A key idea to strengthen the node relaxations, as noted by Tawarmalani and Sahinidis [37], is to use extended formulations. Atamtürk and Narayanan [6] describe mixed-integer rounding inequalities in an extended formulation for conic quadratic integer programming. Vielma et al. [40] use an extended formulation for conic quadratic optimization that can be refined during branch-and-bound, and show that an LP-based branch-and-bound using the extended formulations typically outperforms the NLP-based branch-and-bound algorithms. The reader is referred to Belotti et al. [12] for an excellent survey of the solution methods for mixed-integer nonlinear optimization.
In this paper, we reformulate CO through the perspective of the quadratic term and give algorithms that solve a sequence of closely related QPs. Utilizing the simplex method, the solution to each QP is used to warm start the next one in the sequence, resulting in a small number of simplex iterations and fast solution times. Moreover, we show how to incorporate the proposed approach in a branch-and-bound algorithm, efficiently solving the continuous relaxations to optimality at each node and employing warm starts with the dual simplex method. Our computational experiments indicate that the proposed approach outperforms the state-of-the-art algorithms for convex as well as discrete cases.
The rest of the paper is organized as follows. In Section 2 we give an alternative formulation for CO using the perspective function of the quadratic function. In Section 3 we present coordinate descent and accelerated bisection algorithms that solve a sequence of QPs. In Section 4 we provide computational experiments, comparing the proposed methods with state-of-the-art barrier and other algorithms.
2. Formulation
In this section we present a reformulation of CO using the perspective function of the quadratic term. Let be the feasible region of problem CO. For convex quadratic , consider the function defined as
[TABLE]
Observe that
[TABLE]
where
[TABLE]
We will show that problems CO and PO have, in fact, the same optimal objective value and that there is a one-to-one correspondence between the optimal primal-dual pairs of both problems.
Proposition 1**.**
Problem PO is a convex optimization problem.
Proof.
It suffices to observe that is the closure of the perspective function of the convex quadratic function , and is therefore convex [e.g. 26, p. 160]. Since all other objective terms and constraints of PO are linear, PO is a convex optimization problem. ∎
Proposition 2**.**
Problems CO and PO are equivalent.
Proof.
If , the objective function of problem PO is continuous and differentiable, and since the feasible region is a polyhedron and the problem is convex, its KKT points are equivalent to its optimal solutions. The KKT conditions of PO are
[TABLE]
where and are the dual variables associated with constraints and , respectively. Note that and (2) imply that . Substituting in (1), one arrives at the equivalent conditions
[TABLE]
Ignoring the redundant variable and equation (4), we see that these are the KKT conditions of problem CO. Therefore, any optimal primal-dual pair for PO with is an optimal primal-dual pair for CO. Similarly, we see that any optimal primal-dual pair of problem CO with gives an optimal primal-dual pair of problem PO by setting . In both cases, the objective values match.
On the other hand, if , then PO reduces to problem
[TABLE]
which corresponds to CO with , and hence they are equivalent. ∎
Proposition 2 indeed holds for more general problems; it is not necessary to have a polyhedral feasible set [9]. Since they are equivalent optimization problems, we can use PO to solve CO. In particular, we exploit the fact that, for a fixed value of , PO reduces to a QP.
3. Algorithms
For simplicity, assume that PO has an optimal solution; hence, is nonempty and may be assumed to be bounded. Consider the one-dimensional optimal value function
[TABLE]
As is nonempty and bounded, is real-valued and, by Proposition 1, it is convex. Throughout, denotes an optimal solution to (5).
In this section we describe two algorithms for PO that utilize a QP oracle. The first one is a coordinate descent approach, whereas, the second one is an accelerated bisection search algorithm on the function . Finally, we discuss how to exploit the warm starts with the simplex method to solve convex as well as discrete cases.
3.1. Coordinate descent algorithm
Algorithm 1 successively optimizes over for a fixed value of , and then optimizes over for a fixed value of . Observe that the optimization problem in line 6 over is a QP, and the optimization in line 7 over has a closed form solution: by simply setting the derivative to zero, we find that .
First observe that the sequence of objective values is non-increasing. Moreover, the dual feasibility KKT conditions for the QPs in line 6 are of the form
[TABLE]
Let be a norm and suppose that the QP oracle finds feasible primal-dual pairs with tolerance with respect to . In particular in line 6 violates (6) by at most , i.e.,
[TABLE]
Proposition 3 below states that, at each iteration of Algorithm 1, we can bound the violation of the dual feasibility condition (3) corresponding to the original problem CO. The bound depends only on the precision of the QP oracle , the relative change of in the last iteration , where , and the gradient of the function evaluated at the new point .
Proposition 3** (Dual feasibility bound).**
A pair in Algorithm 1 satisfies
[TABLE]
Proof.
[TABLE]
∎
Let be a minimizer of on . We now show that the sequence of values of produced by Algorithm 1, , is monotone and bounded by .
Proposition 4** (Monotonicity).**
If , then satisfies . Similarly, if , then .
Proof.
If , then . It follows that is a minimizer of an optimization problem with a larger coefficient for the quadratic term than , and therefore , and . Moreover, the inequality follows from the convexity of the one-dimensional function and the fact that function is minimized at , and that . The case is similar. ∎
Since the sequence is bounded and monotone, it converges to a supremum or infimum. Thus is a Cauchy sequence, and . Corollaries 1 and 2 below state that Algorithm 1 converges to an optimal solution. The cases where there exists a KKT point for PO (i.e., there exists an optimal solution with ) and where there are no KKT points are handled separately.
Corollary 1** (Convergence to a KKT point).**
If PO has a KKT point, then Algorithm 1 converges to a KKT point.
Proof.
By convexity, the set of optimal solutions to (5) is an interval, . Since by assumption there exists a KKT point, we have that . The proof is by cases, depending on the value of in line 3 of Algorithm 1.
**Case : **
Since is optimal, we have by Proposition 4 that . Since and , we have that in Proposition 3, and .
**Case : **
We have by Proposition 4 than for all , . Therefore, there exists a number such that for all , and we find that .
**Case : **
We have by Proposition 4 than for all , . Therefore, there exists a number such that for all , and we find that .
Therefore, in all cases, Algorithm 1 convergences to a KKT point by Proposition 3. ∎
Corollary 2** (Convergence to [math]).**
If is the unique optimal solution to , then for any Algorithm 1 finds a solution , where and .
Proof.
The sequence converges to [math] (otherwise, by Corollary 1, it would converge to a KKT point). Thus, and all points obtained in line 6 of Algorithm 1 satisfy . ∎
We now discuss how to initialize and terminate Algorithm 1, corresponding to lines 3 and 9, respectively.
Initialization.
The algorithm may be initialized by an arbitrary . Nevertheless, when a good initial guess on the value of is available, should be set to that value. Moreover, observe that setting results in a fast computation of by solving an LP.
Stopping condition.
Proposition 3 suggests a good stopping condition for Algorithm 1. Given a desired dual feasibility tolerance of , we can stop when . Alternatively, if , then the simpler is another stopping condition. For instance, a crude upper bound on can be found by maximizing/minimizing the numerator over and minimizing over . The latter minimization is guaranteed to have a nonzero optimal value if and is positive definite.
Remark 1*.*
Observe that the stopping condition above may fail if is the unique optimal solution of (Corollary 2). This case happens only if is positive semi-definite (but not positive definite), or if is the unique optimal solution of CO. Such situations rarely arise in practice and can often be ruled out a priori. Nonetheless, stopping Algorithm 1 also when for some small ensures that the algorithm terminates (as specified in Corollary 2) even when .
3.2. Bisection algorithm
Algorithm 2 is an accelerated bisection approach to solve PO. The algorithm maintains lower and upper bounds, and , on and, at each iteration, reduces the interval by at least half. The algorithm differs from the traditional bisection search algorithm in lines 9–13, where it uses an acceleration step to reduce the interval by a larger amount: by Proposition 4, if (line 9), then , and therefore is a higher lower bound on (line 10); similarly, if , then is an lower upper bound on (lines 11 and 12). Intuitively, the algorithm takes a “coordinate descent” step as in Algorithm 1 after each bisection step. Preliminary computations show that the acceleration step reduces the number of steps as well as the overall solution time for the bisection algorithm by about 50%.
Initialization.
In line 3, can be initialized to zero and to , where is an optimal solution to the LP relaxation .
Stopping condition.
There are different possibilities for the stopping criterion in line 18. Note that if we have numbers and such that , then is a lower bound on the optimal objective value . Therefore, in line 7, a lower bound on the objective function can be computed, and the algorithm can be stopped when the gap between and is smaller than a given threshold. Alternatively, stopping when provides a guarantee on the dual infeasibility as in Proposition 3.
3.3. Warm starts
Although any QP solver can be used to run the coordinate descent and bisection algorithms described in Sections 3.1 and 3.2, simplex methods for QP are particularly effective as they allow warm starts for small changes in the model parameters in iterative applications. This is the main motivation for the QP based algorithms presented above.
3.3.1. Warm starts with primal simplex for convex optimization
All QPs solved in Algorithms 1–2 have the same feasible region and only the objective function changes in each iteration. Therefore, an optimal basis for a QP is primal feasible for the next QP solved in the sequence, and can be used to warm start a primal simplex QP solver.
3.3.2. Warm starts with dual simplex for discrete optimization
When solving discrete counterparts of CO with a branch-and-bound algorithm one is particularly interested in utilizing warm starts in solving convex relaxations at the nodes of the search tree. In a branch-and-bound algorithm, children nodes typically have a single additional bound constraint compared to the parent node.
For this purpose, it is also possible to warm start Algorithm 1 from a dual feasible basis. Let be an optimal solution to PO and be an optimal basis. Consider a new problem
[TABLE]
where the feasible set is obtained from by adding new constraints. Note that is a dual feasible basis for (7) when . Therefore, Algorithm 1 to solve problem (7) can be warm started by initializing and using as the initial basis to compute with a dual simplex algorithm. The subsequent QPs can be solved using the primal simplex algorithm as noted in Section 3.3.1.
3.4. Special cases
The simplex method is a general algorithm that can be used with any polyhedron and, as mentioned in Section 3.3, is well suited for solving the sequence of QPs. Nevertheless, for particular feasible regions, other specialized algorithms may be preferable. For instance, in the trivial unbounded case (), the QPs can be solved in closed form. Another, more interesting, case is the problem
[TABLE]
where is fixed, denotes the -norm and is a regularization parameter. Note that (8) is a special case of CO with the usual linearization of the -norm. Problem (8) arises in compressed sensing [10] and sparse linear regression [11], and is solved fast using first-order methods [33]. Note if Algorithm 1 or 2 is used instead, then every QP subproblem corresponds to the well-studied Lasso problem [38], for which efficient special algorithms exist. In particular, problem (8) can be solved with a single call to an algorithm that computes the regularization path (i.e., solves the problem for all ), such as Least Angle Regression [24].
4. Computational experiments
In this section we report on computational experiments with solving convex CO and its discrete counterpart CDO with the algorithms described in Section 3. The algorithms are implemented with CPLEX Java API. We use the simplex and barrier solvers of CPLEX version 12.6.2, as well as the barrier solver of MOSEK version 8.1.0 for the computational experiments. All experiments are conducted on a workstation with a 2.93GHz Intel®CoreTM i7 CPU and 8 GB main memory using a single thread.
4.1. Test problems
We test the algorithms on two types of data sets. For the first set the feasible region is described by a cardinality constraint and bounds, i.e., with ; problems with a cardinality constraint are common in finance [19] and statistics [18]. For the second data set the feasible region consists of the path polytope of an acyclic grid network; conic quadratic optimization over paths has been studied in [17, 36], and similar substructures arise in more complex problems such as vehicle routing [23]. For discrete optimization problems we additionally enforce the binary restrictions .
For both data sets the objective function is generated as follows: Given a rank parameter and density parameter , is the sum of a low rank factor matrix and a full rank diagonal matrix; that is, , where
- •
is an diagonal matrix with entries drawn from Uniform.
- •
where is an matrix with entries drawn from Uniform.
- •
is an matrix in which each entry is [math] with probability and drawn from Uniform with probability .
Note that the construction of matrix is consistent with factor models often used in finance. In particular, is the factor exposure matrix, is the factor covariance matrix and is the matrix of the residual variances. Each linear coefficient is drawn from Uniform.
4.2. Experiments with convex problems
In this section we present the computational results for convex instances. We compare the following algorithms:
**ALG1: **
Algorithm 1.
**ALG2: **
Algorithm 2.
**BAR: **
CPLEX barrier algorithm (the default solver in CPLEX for convex conic quadratic problems).
**MOS: **
MOSEK barrier algorithm.
For algorithms ALG1 and ALG2 we use CPLEX primal simplex algorithm as the QP solver.
Optimality tolerance
As the speed of the interior point methods crucially depends on the chosen optimality tolerance, it is prudent to first compare the speed vs the quality of the solutions for the algorithms tested. Here we study the impact of the optimality tolerance in the solution time and the quality of the solutions for CPLEX barrier algorithm BAR and simplex QP-based algorithm ALG1. The optimality tolerance of the barrier algorithm is controlled by the QCP convergence tolerance parameter (“BarQCPEpComp”), and in Algorithm 1, by the stopping condition .
In both cases, a smaller optimality tolerance corresponds to a higher quality solution. We evaluate the quality of a solution as where is the objective value of the solution found by an algorithm with a given tolerance parameter and is the objective value of the solution found by the barrier algorithm with tolerance (minimum tolerance value allowed by CPLEX). Table 1 presents the results for different tolerance values for a convex grid instance with , , and . The table shows, for varying tolerance values and for each algorithm, the quality of the solution, the solution time in seconds, the number of iterations, and QPs solved (for ALG1). We highlight in bold the default tolerance used for the rest of the experiments presented in the paper. The tolerance value for the barrier algorithm corresponds to the default parameter in CPLEX.
First observe that the solution time increases with reduced optimality tolerance for both algorithms. With lower tolerance, while the barrier algorithm performs more iterations, ALG1 solves more QPs; however, the total number of simplex iterations barely increases. For ALG1 the changes in the value of are very small between QPs, and the optimal bases of the QPs are thus the same. Therefore, using warm starts, the simplex method is able to find high precision solutions inexpensively. ALG1 achieves much higher precision an order of magnitude faster than CPLEX barrier algorithm. For the default tolerance parameters used in our computational experiments, Algorithm 1 is several orders of magnitude more precise than the barrier algorithm.
Effect of the nonlinearity parameter .
We now study the effect of changing the nonlinearity parameter . Tables 2 and 3 show the total solution time in seconds, the total number of simplex or barrier iterations, and the number of QPs solved in cardinality (1000 variables) and path instances (1760 variables), respectively. Each row represents the average over five instances for a rank () and density() configuration and algorithm used. For each parameter choice the fastest algorithm is highlighted in bold. Figure 1 also shows the total number of instances solved within the given time limit for each instance class.
Observe that, compared to CPLEX barrier algorithm, the simplex QP-based methods are 3.5 and 6 times faster for the cardinality instances and up to 15 times faster for the path instances. Additionally, the simplex QP-based methods are two to three times faster than MOSEK barrier algorithm for the cardinality instances, and up to four times faster for the path instances. Figure 1 shows that the simplex QP-based methods solve most of the instances well within the time required for MOSEK barrier algorithm to solve the easiest instance.
The barrier algorithms do not appear to be too sensitive to the nonlinearity parameter , whereas the simplex QP-based methods are faster for smaller . The number of simplex iterations in ALG1 increases with the nonlinearity parameter . Indeed, the initial problem solved by ALG1 is an LP (corresponding to ), so as increases the initial problem becomes a worse approximation, and more work is needed to converge to an optimal solution. Also note that Algorithm 2 requires fewer QPs to be solved, but as a result it benefits less from warm starts (it requires more simplex iterations per QP than ALG1). Indeed, in ALG2 the value of changes by a larger amount at each iteration (with respect to ALG1), so the objective function of two consecutive QPs changes by a larger amount. Finally, note that although their runtime is very close, the performance of ALG2 is slightly better than ALG1 overall.
Effect of the dimension
Table 4 presents a comparison of the algorithms for the convex cardinality instances with sizes 400, 800, 1600, and 3200. Each row represents the average over five instances, as before, generated with parameters , , and . Additionally, Figure 2 shows the solution time for ALG1, ALG2 and MOS as a function of the dimension ().
Observe in Table 4 that the number of QPs solved with the simplex-based algorithms does not depend on the dimension. The number of simplex iterations, however, increases with the dimension. For all algorithms perform similarly and the problems are solved very fast. However, as the dimension increases, the simplex-based algorithms outperform the barrier algorithms, often by many factors. For , the fastest simplex-based algorithm ALG2 is more than 20 times faster than CPLEX barrier algorithm, and more than five times faster than MOSEK barrier algorithm. Similar results are obtained for other parameter choices and for the path instances as well. In summary, the simplex-based algorithms scale better with the dimension, and are faster by orders of magnitude for large instances. As in Section 4.2, ALG2 slightly outperforms ALG1 for the instances considered.
4.3. Discrete instances
In this section we describe our experiments with the discrete counterpart CDO. To the best of our knowledge, as of version 12.6.2 of CPLEX, there is no documented way to embed a user-defined convex solver such as Algorithm 1 or 2 at the nodes of the CPLEX branch-and-bound algorithm. Therefore, in order to test the proposed approach for CDO, we implement a rudimentary branch-and-bound algorithm described in Appendix A. The algorithm uses a maximum infeasibility rule for branching, and does not employ presolve, cutting planes, or heuristics. We test the following configurations:
**BBA1: **
Branch-and-bound algorithm in Appendix A using Algorithm 1 as the convex solver. The first QP at each node (except the root node) is solved with CPLEX dual simplex method using the parent dual feasible basis as a warm start (as mentioned in Section 3.3) and all other QPs are solved with CPLEX primal simplex method using the basis from the parent node QP as a warm start.
**BBA2: **
Branch-and-bound algorithm in Appendix A using Algorithm 2 as the convex solver. Algorithm 2 resulted in the best performance in the continuous instances; however, unlike Algorithm 1, it cannot be naturally warm-started. Thus, in this configuration, each convex subproblem is solved without exploiting the solution from the parent node.
**BBBR: **
Branch-and-bound algorithm in Appendix A, using CPLEX barrier algorithm as the convex solver. This configuration does not use warm starts.
**CXBR: **
CPLEX branch-and-bound algorithm with barrier solver, setting the branching rule to maximum infeasibility, the node selection rule to best bound, and disabling presolve, cuts and heuristics. In this setting CPLEX branch-and-bound algorithm is as close as possible to our branch-and-bound algorithm.
**CXLP: **
CPLEX branch-and-bound algorithm with LP outer approximations, setting the branching rule to maximum infeasibility, the node selection rule to best bound, and disabling presolve, cuts and heuristics. In this setting CPLEX branch-and-bound algorithm is as close as possible to our branch-and-bound algorithm.
**CXLPE: **
CPLEX branch-and-bound algorithm with LP outer approximations, setting the branching rule to maximum infeasibility, the node selection rule to best bound, and disabling cuts and heuristic. Since presolve is activated, CPLEX uses extended formulations described in [40]. Besides presolve, all other parameters are set as in CXLP.
**CXD: **
CPLEX default branch-and-bound algorithm with LP outer approximations. This algorithm utilizes all sophisticated features of CPLEX, such as presolver, cutting planes, heuristics, advanced branching and node selection rules.
The time limit is set to two hours for each algorithm.
Table 5 presents the results for discrete cardinality instances with 200 variables and Table 6 for the discrete path instances with 1,740 variables ( grid). Each row represents the average over five instances with varying rank and density parameters, and algorithm. The tables show the solution time in seconds, the number of nodes explored in the branch-and-bound tree, the end gap after two hours as percentage, and the number of instances that are solved to optimality for varying values of . For each instance class we highlight in bold the algorithm with the best performance. Figure 3 shows, for each instance class, the total number of instances solved within given time limits for BBA1, CXLPE and CXD.
First of all, observe that the difficulty of the instances increases considerably for higher values of due to higher integrality gap. The problems corresponding to high values of the density parameter are also more challenging.
Performance of CPLEX branch-and-bound
Among CPLEX branch-and-bound algorithms, CXD is the best choice when . Configuration CXD is much more sophisticated than the other configurations, so a better performance is expected. However, note that for configuration CXD is not necessarily the best. In particular in the path instances (Table 6) CXLP and CXLPE are 2.3 times faster than CXD. This result suggests that in simple instances the additional features used by CXD (e.g. cutting planes and heuristics) may be hurting the performance.
The extended formulations result in much stronger relaxations in LP based branch-and-bound and, consequently, the number of branch-and-bound nodes required with CXLPE is only a small fraction of the number of nodes required with CXLP. However, CXLPE requires more time to solve each branch-and-bound node, due to the higher number of variables and the additional effort needed to refine the LP outer approximations. For the cardinality instances, CXLPE is definitely the better choice and is faster by orders of magnitude. For the path instances, however, CXLP is not necessarily inferior: when CXLP is competitive with CXLPE, and when CXLP performs better.
The barrier-based branch-and-bound CXBR, in general, performs poorly. For the cardinality instances, it outperforms CXLP but is slower than the other algorithms. For the path instances it has the worst performance, often struggling to find even a single feasible solution (resulting in infinite end gaps).
Performance of BBA1
Note that BBA1, BBA2 and BBBR are very simple and differ only by the convex node solver. BBA1 is faster than BBBR by an order of magnitude. BBA1 is also two to three times faster than BBA2, despite the fact that Algorithm 2 is faster for convex problems. This improvement is due to the warm start capabilities of Algorithm 1. BBA1 is considerably faster than the simplest CPLEX branch-and-bound algorithms CXBR and CXLP.
We see that BBA1 consistently outperforms CXLPE (which uses presolve and extended formulations), often by many factors. In fact, BBA1 resulted in better performance (faster solution times or lower end gaps) than CXLPE in every instance. Observe that in the cardinality instances with and path instances with , BBA1 requires half the number of nodes (or less) compared to CXLPE to solve the instances to optimality (since the relaxations solved at each node are stronger), which translates into faster overall solution times. In the more difficult instances BBA1 is able to solve more instances to optimality, and the end gaps are smaller.
Despite the fact that BBA1 is a rudimentary branch-and-bound implementation, it is faster than default CPLEX in most of the cases. Indeed, BBA1 outperforms CXD in 143 out of 160 instances tested. Figure 3 clearly shows that BBA1 solves more instances faster compared to CXLPE and CXD.
Warm starts
To quantify the impact of warm starts, we plot in Figure 4 the time per node (computed as solution time divided by the number of branch-and-bound nodes) for BBA1, BBA2, BBBR and CXLPE, and also plot the solution time for the corresponding convex instances with solvers ALG1, ALG2, and BAR111The time per node is similar for all combinations of parameters , and . We plot the average for all instances with ..
For the small cardinality instances with 200 variables, all three algorithms perform similarly for the convex instances; however, Algorithm 1 is 15 times faster than barrier and more than three times faster than Algorithm 2 when used in branch-and-bound due to the node warm starts from dual feasible solutions. For the larger path instances with 1,740 variables, Algorithm 1 is again close to three times faster than Algorithm 2 and almost 20 times faster than barrier in discrete instances. Finally, observe that the solve time per node for BBA1 is smaller compared to CXLPE: the proposed simplex-based algorithm is thus as effective as the simplex method for extended formulations in exploiting warm starts. Moreover, it solves the nonlinear convex relaxations at each node to optimality, whereas CXLPE solves its LP relaxation. The improved lower bounds lead to significantly small search trees.
We conclude that Algorithm 1 is indeed suitable for branch-and-bound algorithms since it benefits from node warms starts from the parent nodes, resulting in a significant improvement in solution times.
5. Conclusions
We consider minimization problems with a conic quadratic objective and linear constraints, which are natural generalizations of linear programming and quadratic programming. Using the perspective function we reformulate the objective and propose simplex QP-based algorithms that solve a quadratic program at each iteration. Computational experiments indicate that the proposed algorithms are faster than interior point methods by orders of magnitude, scale better with the dimension of the problem, return higher precision solutions, and, most importantly, are amenable to warm starts. Therefore, they can be embedded in branch-and-bound algorithms quite effectively.
Acknowledgement
This research is supported, in part, by grant FA9550-10-1-0168 from the Office of the Assistant Secretary of Defense for Research and Engineering.
Appendix A Branch-and-bound algorithm
Algorithm 3 describes the branch-and-bound algorithm used in computations. Throughout the algorithm, we maintain a list of the nodes to be processed. Each node is a tuple , where is the subproblem, is a basis for warm starting the continuous solver and is a lower bound on the objective value of . In line 5 list is initialized with the root node. For each node, the algorithm calls a continuous solver (line 11) which returns a tuple , where is an optimal solution of , is the corresponding optimal basis and is the optimal objective value (or if is infeasible). The algorithm then checks whether the node can be pruned (lines 12-13), is integer (lines 14-17), or it further branching is needed (lines 18-20).
We now describe the specific implementations of the different subroutines. For branching (line 19) we use the maximum infeasibility rule, which chooses the variable with value furtherest from an integer (ties broken arbitrarily). The subproblems and in line 20 are created by imposing the constraints and , respectively. The PULL routine in line 7 chooses, when possible, the child of the previous node which violates the bound constraint by the least amount, and chooses the node with the smallest lower bound when the previous node has no child nodes. The list is thus implemented as a sorted list ordered by the bounds, so that the PULL operation is done in and the insertion is done in (note that in line 20 we only add to the list the node that is not to be processed immediately). A solution is assumed to be integer (line 14) when the values of all variables are within of an integer. Finally, the algorithm is terminated when , where is the minimum lower bound among all the nodes in the tree.
The maximum infeasibility rule is chosen due to its simplicity. The other rules and parameters correspond to the ones used in CPLEX branch-and-bound algorithm in default configuration.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Ahmed and Atamtürk [2011] S. Ahmed and A. Atamtürk. Maximizing a class of submodular utility functions. Mathematical Programming , 128:149–169, 2011.
- 2Alizadeh [1995] F. Alizadeh. Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM Journal on Optimization , 5:13–51, 1995.
- 3Alizadeh and Goldfarb [2003] F. Alizadeh and D. Goldfarb. Second-order cone programming. Mathematical Programming , 95:3–51, 2003.
- 4Atamtürk and Goméz [2016] A. Atamtürk and A. Goméz. Submodularity in conic quadratic mixed 0-1 optimization. ar Xiv preprint ar Xiv:1705.05918 , 2016. BCOL Research Report 16.02, UC Berkeley.
- 5Atamtürk and Jeon [2017] A. Atamtürk and H. Jeon. Lifted polymatroid inequalities for mean-risk optimization with indicator variables. ar Xiv preprint ar Xiv:1705.05915 , 2017. BCOL Research Report 17.01, UC Berkeley.
- 6Atamtürk and Narayanan [2007] A. Atamtürk and V. Narayanan. Cuts for conic mixed-integer programming. In Matteo Fischetti and David P. Williamson, editors, Integer Programming and Combinatorial Optimization , pages 16–29, Berlin, Heidelberg, 2007. Springer. ISBN 978-3-540-72792-7.
- 7Atamtürk and Narayanan [2008] A. Atamtürk and V. Narayanan. Polymatroids and risk minimization in discrete optimization. Operations Research Letters , 36:618–622, 2008.
- 8Atamtürk and Narayanan [2009] A. Atamtürk and V. Narayanan. The submodular 0-1 knapsack polytope. Discrete Optimization , 6:333–344, 2009.
