Conic relaxation approaches for equal deployment problems
Sena Safarina, Satoko Moriguchi, Tim J. Mullin, and Makoto Yamashita

TL;DR
This paper explores conic relaxation methods for the equal deployment problem in breeding, proposing a steepest-ascent approach that yields high-quality solutions efficiently, especially using SOCP relaxations.
Contribution
It introduces conic relaxation techniques for the mixed-integer SOCP equal deployment problem and develops a steepest-ascent method to improve solution quality efficiently.
Findings
SOCP relaxation provides suitable solutions balancing quality and computation time.
The steepest-ascent method enhances solutions faster than traditional methods.
SDP bounds are less sharp for tree breeding problems.
Abstract
An important problem in the breeding of livestock, crops, and forest trees is the optimum of selection of genotypes that maximizes genetic gain. The key constraint in the optimal selection is a convex quadratic constraint that ensures genetic diversity, therefore, the optimal selection can be cast as a second-order cone programming (SOCP) problem. Yamashita et al. (2015) exploits the structural sparsity of the quadratic constraints and reduces the computation time drastically while attaining the same optimal solution. This paper is concerned with the special case of equal deployment (ED), in which we solve the optimal selection problem with the constraint that contribution of genotypes must either be a fixed size or zero. This involves a nature of combinatorial optimization, and the ED problem can be described as a mixed-integer SOCP problem. In this paper, we discuss conic…
| Z | lower bound | expected value | upper bound | ||
|---|---|---|---|---|---|
| 200 | 0.0334 | 16.161 | 25.812 | 30.340 | 25.386 |
| 1050 | 0.0627 | 5.075 | 32.305 | 112.600 | 24.938 |
| 2045 | 0.0711 | 279.259 | 446.089 | 2007.212 | 438.659 |
| 5050 | 0.1081 | 5.775 | 284.965 | 806.205 | 42.786 |
| Algorithm | iter | time (s) | |||||
|---|---|---|---|---|---|---|---|
| CR (LP) | 200 | 0.0334 | 28.068 | 0.0574 | 0 | -133.895 | 0.07 |
| SA (LP) | 25.029 | 0.0334 | 21 | 25.029 | 0.10 | ||
| CR (LP)-s | 28.068 | 0.0574 | 0 | -133.895 | 0.01 | ||
| SA (LP)-s | 25.029 | 0.0334 | 21 | 25.029 | 0.04 | ||
| CR (SOCP) | 26.156 | 0.0334 | 0 | -41.484 | 0.02 | ||
| SA (SOCP) | 25.090 | 0.0334 | 13 | 25.090 | 0.06 | ||
| CR (SDP) | 25.386 | 0.0321 | 0 | 18.978 | 1.29 | ||
| SA (SDP) | 25.207 | 0.0334 | 4 | 25.207 | 1.30 | ||
| CR (LP) | 1050 | 0.0627 | 30.754 | 0.1362 | 0 | -198.235 | 0.37 |
| SA (LP) | 22.707 | 0.0627 | 23 | 22.707 | 0.51 | ||
| CR (LP)-s | 30.754 | 0.1362 | 0 | -198.235 | 0.01 | ||
| SA (LP)-s | 22.707 | 0.0627 | 23 | 22.707 | 0.15 | ||
| CR (SOCP) | 25.284 | 0.0627 | 0 | 19.621 | 0.08 | ||
| SA (SOCP) | 24.831 | 0.0627 | 2 | 24.831 | 0.09 | ||
| CR (SDP) | 24.938 | 0.0617 | 0 | 24.721 | 27.94 | ||
| SA (SDP) | 24.846 | 0.0627 | 2 | 24.846 | 27.96 | ||
| CR (LP) | 2045 | 0.0711 | 504.217 | 0.4566 | 0 | -26197.137 | 1.16 |
| SA (LP) | 414.591 | 0.0710 | 32 | 414.591 | 1.47 | ||
| CR (LP)-s | 504.217 | 0.4566 | 0 | -26197.137 | 0.01 | ||
| SA (LP)-s | 414.591 | 0.0710 | 32 | 414.591 | 0.32 | ||
| CR (SOCP) | 439.353 | 0.0711 | 0 | 293.122 | 0.06 | ||
| SA (SOCP) | 438.386 | 0.0710 | 2 | 438.386 | 0.09 | ||
| CR (SDP) | 438.659 | 0.0706 | 0 | 438.457 | 145.57 | ||
| SA (SDP) | 438.457 | 0.0710 | 1 | 438.457 | 145.59 | ||
| CR (LP) | 5050 | 0.1081 | 57.630 | 0.3672 | 0 | -1185.866 | 10.17 |
| SA (LP) | 38.696 | 0.1080 | 23 | 38.696 | 11.17 | ||
| CR (LP)-s | 57.630 | 0.3672 | 0 | -1185.866 | 0.01 | ||
| SA (LP)-s | 38.696 | 0.1080 | 23 | 38.696 | 0.98 | ||
| CR (SOCP) | 43.036 | 0.1081 | 0 | 42.456 | 0.21 | ||
| SA (SOCP) | 42.691 | 0.1080 | 3 | 42.691 | 0.37 | ||
| CR (SDP) | 42.786 | 0.0980 | 0 | 41.327 | 2221.22 | ||
| SA (SDP) | 42.431 | 0.1080 | 3 | 42.431 | 2221.40 | ||
| CR (LP) | 10100 | 0.0701 | 62.377 | 0.2368 | 0 | -1305.4682 | 46.84 |
| SA (LP) | 41.284 | 0.0701 | 32 | 41.284 | 49.87 | ||
| CR (LP)-s | 62.377 | 0.2368 | 0 | -1305.468 | 0.01 | ||
| SA (LP)-s | 41.284 | 0.0701 | 32 | 41.284 | 3.29 | ||
| CR (SOCP) | 47.445 | 0.0701 | 0 | 21.094 | 0.54 | ||
| SA (SOCP) | 46.568 | 0.0701 | 2 | 46.568 | 0.87 | ||
| CR (SDP) | 21.265 | 0.0545 | 0 | 13.369 | 5577.80 | ||
| SA (SDP) | 44.662 | 0.0701 | 45 | 44.662 | 5582.46 | ||
| CR (LP) | 15222 | 0.0388 | 603.783 | 0.4568 | 0 | -67047.589 | 129.55 |
| SA (LP) | 438.791 | 0.0388 | 42 | 438.791 | 139.03 | ||
| CR (LP)-s | 603.783 | 0.4568 | 0 | -67047.589 | 0.01 | ||
| SA (LP)-s | 438.791 | 0.0388 | 42 | 438.791 | 6.45 | ||
| CR (SOCP) | 468.367 | 0.0388 | 0 | -1042.485 | 0.99 | ||
| SA (SOCP) | 460.769 | 0.0388 | 9 | 460.769 | 2.56 | ||
| CR (SDP) | 288.739 | 0.0195 | 0 | 314.493 | 17433.38 | ||
| SA (SDP) | 460.409 | 0.0388 | 43 | 460.409 | 17441.93 |
| Algorithm | iter | time (s) | |||||
|---|---|---|---|---|---|---|---|
| CR (LP) | 200 | 0.0258 | 24.654 | 0.0304 | 0 | -22.392 | 0.01 |
| SA (LP) | 23.355 | 0.0258 | 18 | 23.355 | 0.07 | ||
| CR (LP)-s | 24.654 | 0.0304 | 0 | -22.392 | 0.01 | ||
| SA (LP)-s | 23.355 | 0.0258 | 18 | 23.355 | 0.06 | ||
| CR (SOCP) | 24.015 | 0.0258 | 0 | 0.950 | 0.01 | ||
| SA (SOCP) | 23.412 | 0.0258 | 13 | 23.412 | 0.08 | ||
| CR (SDP) | 23.640 | 0.0255 | 0 | 21.783 | 0.82 | ||
| SA (SDP) | 23.521 | 0.0258 | 5 | 23.521 | 0.84 | ||
| CR (LP) | 1050 | 0.0539 | 27.637 | 0.1214 | 0 | -208.680 | 0.39 |
| SA (LP) | 19.805 | 0.0539 | 42 | 19.805 | 0.98 | ||
| CR (LP)-s | 27.637 | 0.1214 | 0 | -208.680 | 0.01 | ||
| SA (LP)-s | 19.805 | 0.0539 | 42 | 19.805 | 0.590 | ||
| CR (SOCP) | 22.432 | 0.0539 | 0 | 18.537 | 0.08 | ||
| SA (SOCP) | 22.321 | 0.0539 | 5 | 22.321 | 0.15 | ||
| CR (SDP) | 22.358 | 0.0537 | 0 | 22.242 | 30.49 | ||
| SA (SDP) | 22.324 | 0.0539 | 3 | 22.324 | 30.54 | ||
| CR (LP) | 2045 | 0.0628 | 478.114 | 0.4219 | 0 | -26349.725 | 1.17 |
| SA (LP) | 406.348 | 0.0628 | 65 | 406.348 | 5.00 | ||
| CR (LP)-s | 478.114 | 0.4219 | 0 | -26349.725 | 0.01 | ||
| SA (LP)-s | 406.348 | 0.0628 | 65 | 406.348 | 3.78 | ||
| CR (SOCP) | 421.696 | 0.0628 | 0 | 197.423 | 0.07 | ||
| SA (SOCP) | 421.113 | 0.0627 | 3 | 421.113 | 0.25 | ||
| CR (SDP) | 421.497 | 0.0627 | 0 | 364.014 | 165.33 | ||
| SA (SDP) | 421.425 | 0.0628 | 2 | 421.425 | 165.46 | ||
| CR (LP) | 5050 | 0.0994 | 54.903 | 0.3355 | 0 | -1137.701 | 10.35 |
| SA (LP) | 36.509 | 0.0994 | 50 | 36.509 | 17.44 | ||
| CR (LP)-s | 54.903 | 0.3355 | 0 | -1137.701 | 0.01 | ||
| SA (LP)-s | 36.509 | 0.0994 | 50 | 36.509 | 7.03 | ||
| CR (SOCP) | 40.769 | 0.0995 | 0 | 15.408 | 0.25 | ||
| SA (SOCP) | 40.629 | 0.0995 | 3 | 40.629 | 0.71 | ||
| CR (SDP) | 40.711 | 0.0992 | 0 | 31.447 | 2164.49 | ||
| SA (SDP) | 40.690 | 0.0994 | 2 | 40.690 | 2164.85 | ||
| CR (LP) | 10100 | 0.0610 | 60.347 | 0.2245 | 0 | -1395.786 | 46.71 |
| SA (LP) | 39.911 | 0.0610 | 62 | 39.911 | 68.54 | ||
| CR (LP)-s | 60.347 | 0.2245 | 0 | -1395.786 | 0.01 | ||
| SA (LP)-s | 39.911 | 0.0610 | 62 | 39.911 | 21.51 | ||
| CR (SOCP) | 44.819 | 0.0610 | 0 | 10.608 | 0.70 | ||
| SA (SOCP) | 44.522 | 0.0610 | 7 | 44.522 | 3.12 | ||
| CR (SDP) | 21.374 | 0.0532 | 0 | 18.463 | 6750.06 | ||
| SA (SDP) | 42.810 | 0.0610 | 66 | 42.810 | 6773.05 | ||
| CR (LP) | 15222 | 0.0300 | 575.227 | 0.4318 | 0 | -74482.507 | 128.21 |
| SA (LP) | 408.725 | 0.0300 | 90 | 408.725 | 185.33 | ||
| CR (LP)-s | 575.227 | 0.4318 | 0 | -74482.507 | 0.02 | ||
| SA (LP)-s | 408.725 | 0.0300 | 90 | 408.725 | 49.14 | ||
| CR (SOCP) | 444.730 | 0.0300 | 0 | -11.0395 | 1.05 | ||
| SA (SOCP) | 441.438 | 0.0300 | 6 | 441.438 | 4.72 | ||
| CR (SDP) | 309.173 | 0.0228 | 0 | -65291.694 | 19467.30 | ||
| SA (SDP) | 406.266 | 0.0300 | 92 | 406.266 | 19525.52 |
| Algorithm | #chosen | time (s) | |||||
|---|---|---|---|---|---|---|---|
| GENCONT | 200 | 0.0334 | 25.290 | 0.0342 | 20.087 | 50 | 0.06 |
| OPSEL | 25.191 | 0.0334 | 25.191 | 50 | 1779.13 | ||
| CPLEX | 25.190 | 0.0334 | 25.190 | 50 | 4270.77 | ||
| SA (SOCP) | 25.090 | 0.0334 | 25.090 | 50 | 0.06 | ||
| GENCONT | 1050 | 0.0627 | 24.983 | 0.0627 | 24.983 | 48 | 7.91 |
| OPSEL | 24.858 | 0.0627 | 24.858 | 50 | 10800 | ||
| CPLEX | Cannot obtain a sensible solution in 3 hours | 10800 | |||||
| SA (SOCP) | 24.831 | 0.0627 | 24.831 | 50 | 0.09 | ||
| GENCONT | 2045 | 0.0711 | 437.049 | 0.0694 | 437.049 | 50 | 88.46 |
| OPSEL | 435.826 | 0.0692 | 435.826 | 50 | 16.08 | ||
| CPLEX | 436.213 | 0.0680 | 436.212 | 50 | 0.37 | ||
| SA (SOCP) | 438.386 | 0.0710 | 438.386 | 50 | 0.09 | ||
| GENCONT | 5050 | 0.1081 | 42.780 | 0.1089 | -306.701 | 50 | 1769.72 |
| OPSEL | 42.702 | 0.1081 | 42.702 | 50 | 10800 | ||
| CPLEX | 42.456 | 0.1066 | 42.456 | 50 | 2.02 | ||
| SA (SOCP) | 42.691 | 0.1080 | 42.691 | 50 | 0.37 | ||
| GENCONT | 10100 | 0.0701 | Out of memory | ||||
| OPSEL | 46.252 | 0.0700 | 46.252 | 50 | 10800 | ||
| CPLEX | Cannot obtain a sensible solution in 3 hours | 10800 | |||||
| SA (SOCP) | 46.568 | 0.0701 | 46.568 | 50 | 0.87 | ||
| GENCONT | 15222 | 0.0388 | Out of memory | ||||
| OPSEL | 459.040 | 0.0388 | 459.040 | 50 | 10800 | ||
| CPLEX | 459.135 | 0.0386 | 459.135 | 50 | 39.20 | ||
| SA (SOCP) | 460.769 | 0.0388 | 460.769 | 50 | 2.56 | ||
| Algorithm | #chosen | time (s) | |||||
|---|---|---|---|---|---|---|---|
| GENCONT | 200 | 0.0258 | 23.640 | 0.0261 | 21.253 | 100 | 0.07 |
| OPSEL | 23.551 | 0.0258 | 23.551 | 100 | 10800 | ||
| CPLEX | 23.508 | 0.0258 | 23.508 | 100 | 1.42 | ||
| SA (SOCP) | 23.412 | 0.0258 | 23.412 | 100 | 0.08 | ||
| GENCONT | 1050 | 0.0539 | 22.749 | 0.0539 | 22.749 | 91 | 9.63 |
| OPSEL | 22.275 | 0.0539 | 22.275 | 100 | 304.89 | ||
| CPLEX | Cannot obtain a sensible solution in 3 hours | 10800 | |||||
| SA (SOCP) | 22.321 | 0.0539 | 22.321 | 100 | 0.15 | ||
| GENCONT | 2045 | 0.0628 | 421.005 | 0.0632 | 392.893 | 100 | 105.40 |
| OPSEL | 419.600 | 0.0613 | 419.600 | 100 | 6.85 | ||
| CPLEX | 420.748 | 0.0619 | 420.748 | 100 | 0.41 | ||
| SA (SOCP) | 421.113 | 0.0627 | 421.113 | 100 | 0.25 | ||
| GENCONT | 5050 | 0.0995 | 40.692 | 0.0997 | 40.692 | 100 | 1940.43 |
| OPSEL | 40.468 | 0.0994 | 40.468 | 100 | 50.46 | ||
| CPLEX | Cannot obtain a sensible solution in 3 hours | 10800 | |||||
| SA (SOCP) | 40.629 | 0.0995 | 40.629 | 100 | 0.71 | ||
| GENCONT | 10100 | 0.0610 | Out of memory | ||||
| OPSEL | 44.467 | 0.0696 | 44.467 | 100 | 10800 | ||
| CPLEX | Cannot obtain a sensible solution in 3 hours | 10800 | |||||
| SA (SOCP) | 44.522 | 0.0610 | 44.522 | 100 | 3.12 | ||
| GENCONT | 15222 | 0.0300 | Out of memory | ||||
| OPSEL | 441.770 | 0.0300 | 441.770 | 100 | 10800 | ||
| CPLEX | 440.996 | 0.0290 | 440.99640 | 100 | 7.45 | ||
| SA (SOCP) | 441.438 | 0.0300 | 441.438 | 100 | 4.72 | ||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Conic relaxation approaches for equal deployment problems
Sena Safarina 111 Department of Mathematical and Computing Science, Tokyo Institute of Technology, 2-12-1-W8-29 Ookayama, Meguro-ku, Tokyo 152-8552, Japan. (The work of M. Yamashita was partially supported by JSPS KAKENHI (Grant-in-Aid for Scientific Research (C), 15K00032).) , Satoko Moriguchi 222 School of Business Administration, Faculty of Urban Liberal Arts, Tokyo Metropolitan University, 1-1 Minami-Osawa, Hachioji-shi, Tokyo 192-0397, Japan. , Tim J. Mullin 333 The Swedish Forestry Research Institute (Skogforsk), Box 3, Sävar 918 21, Sweden; and 224 rue du Grand-Royal Est, QC, J2M 1R5, Canada. (The work of T. J. Mullin was partially supported by Föreningen Skogsträdsförd̈ling (The Swedish Tree Breeding Foundation), and Makoto Yamashita 11footnotemark: 1
Submitted: February 27, 2017.
Abstract: An important problem in the breeding of livestock, crops, and forest trees is the optimum of selection of genotypes that maximizes genetic gain. The key constraint in the optimal selection is a convex quadratic constraint that ensures genetic diversity, therefore, the optimal selection can be cast as a second-order cone programming (SOCP) problem. Yamashita et al. (2015) exploits the structural sparsity of the quadratic constraints and reduces the computation time drastically while attaining the same optimal solution.
This paper is concerned with the special case of equal deployment (ED), in which we solve the optimal selection problem with the constraint that contribution of genotypes must either be a fixed size or zero. This involves a nature of combinatorial optimization, and the ED problem can be described as a mixed-integer SOCP problem.
In this paper, we discuss conic relaxation approaches for the ED problem based on LP (linear programming), SOCP, and SDP (semidefinite programming). We analyze theoretical bounds derived from the SDP relaxation approaches using the work of Tseng (2003) and show that the theoretical bounds are not quite sharp for tree breeding problems. We propose a steepest-ascent method that combines the solution obtained from the conic relaxation problems with a concept from discrete convex optimization in order to acquire an approximate solution for the ED problem in a practical time. From numerical tests, we observed that among the LP, SOCP, and SDP relaxation problems, SOCP gave a suitable solution from the viewpoints of the optimal values and the computation time. The steepest-ascent method starting from the SOCP solution provides high-quality solutions much faster than an existing method that has been widely used for the optimal selection problems and a branch-and-bound method.
Keywords: Semidefinite programming, Second-order cone programming, Mixed-integer conic programming, Conic relaxation, Tree breeding, Equal deployment problem.
MCS2010 classification: 90C05 Linear programming, 90C11 Mixed integer programming, 90C22 Semidefinite programming, 90C25 Convex programming, 90C59 Approximation methods and heuristics, 90C90 Applications of mathematical programming, 92-08 Biology and other natural sciences (Computational methods).
1 Introduction
Computational methods based on mathematical optimization have started gaining attention from breeding researchers, since the optimization methods provide efficient approaches and give theoretical aspects for the optimality of the obtained solutions. For example, optimal selection problems that determine the contributions of genotypes are studied for clonal seed orchards and dairy cattle [5, 13, 15, 19, 22, 24, etc].
A main objective in optimal selection problems is to attain the highest response from a genotype selection. Lindgren et al. [19] proposed a linear deployment in which the genotype contributions are basically proportional to their breeding values. This deployment was derived from a concept that the genotypes with higher breeding values should appear more frequently than those with lower values. An advantage of the linear deployment was the extremely low computation cost, since it could be computed by a greedy algorithm. However, the linear deployment worked well only when the pedigree situation was simple, that is, the candidate genotypes were unrelated. If the selected genotypes do not embrace enough diversity, the response will critically diminish through inbreeding depression [6, 40] due to accumulated kinship.
Meuwissen [22] introduced a quadratic constraint to control a group coancestry under an appropriate level. He developed the Lagrangian multiplier method to maximize the genetic response with the quadratic constraints. This method was implemented in a software package GENCONT [22], and it has been widely accepted among breeding researchers. A serious drawback of the Lagrangian multiplier method is that this method does not always generate optimal solutions. In contrast, Pong-Wong et al. [31] employed an SDP approach. This approach is based on mathematical optimization, and they demonstrated that this approach gave the optimal contributions exactly. This approach was extended in [1], but their SDP approach required long computation time even when they used parallel computing with the help of SDPA (a high-performance solver for SDPs) [45, 46]. Recently, Yamashita et al. [47] proposed an SOCP (second-order cone programming) approach and successfully reduced the computation time of the SDP approach attaining the same optimal solution.
The problems solved by the SDP approach [31] and the SOCP approach [47] are unequal deployment (UD) problems of form
[TABLE]
Throughout this paper, we use to denote the number of candidate genotypes. In the UD problem, the variable is the vector \mbox{\boldmathx}\in\mbox{\mathbb{R}}^{Z}, and indicates the contribution of the th genotype. We use a superscript to denote the transpose of a vector or a matrix. The cost vector \mbox{\boldmathg}\in\mbox{\mathbb{R}}^{Z} in the objective function is the estimated breeding value (EBV) [21]. Since this vector is computed separately, we regard as a constant vector. The matrix \mbox{\boldmathA}\in\mbox{\mathbb{R}}^{Z\times Z} is the Wright numerator matrix [43]. The elements of this matrix are given from the information of heredity diagram. We should emphasize that the matrix is always symmetric and positive definite. Hence, with a given constant , the constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta is a convex constraint, and this quadratic constraint ensures that the group coancestry \frac{\mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}}{2} in the selected group is kept under a permissible range . We use \mbox{\boldmathe}\in\mbox{\mathbb{R}}^{n} to denote the vector of all ones, therefore, the constraint \mbox{\boldmathe}^{T}\mbox{\boldmathx}=1 indicates that the total contribution of all the candidates is unity. In addition, the vectors \mbox{\boldmathl}\in\mbox{\mathbb{R}}^{Z} and \mbox{\boldmathu}\in\mbox{\mathbb{R}}^{Z} are the lower and upper bounds of the variable , respectively.
The name an unequal deployment indicates that the contributions need not to be equal. Since the variable is a continuous variable and the constraints are linear or convex-quadratic, the UD problem can be cast a type of SOCP problems, as pointed in [47]. Therefore, the UD problem can be solved in a polynomial time algorithm, for example, interior-point methods for SOCP [2, 7, 38].
This paper is concerned with the special-case problem of equal deployment (ED) form
[TABLE]
We use to denote the optimal value of this problem. The crucial difference from the UD problem is that the ED problem has the binary constraints . We choose exactly genotypes from candidates, and the selected genotypes must contribute their genes equally. The ED problems fit breeding populations, where we consider the selected genotypes should contribute with the same amount and therefore we require a fixed-size population.
Weng et al. [39] solved the ED problem only with the linear constraints and the binary constraints using the “Solver” tool in Microsoft Excel. Meuwisen extended GENCONT to the ED problems incorporating some heuristic methods so that GENCONT generated approximate solutions that satisfy the binary constraints. The heuristic methods implemented in GENCONT are partially discussed in [42].
From the viewpoint of mathematical optimization, the most difficult constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta is a quadratic convex constraint. An ED problem (11) can thus be viewed as a mixed-integer second-order cone programing (MI-SOCP) problem. Many approaches have been explored to solve MI-SOCP efficiently. Ben-tal and Nemirovski [3] proposed a polyhedral relaxation that approximates a second-order cone with a polyhedron so that the resulting problem can be handled with software packages for mixed-integer linear programming. Drewes applied an outer approximation method and a branch-and-cut method [8]. For other approaches, a survey paper due to Benson and Saglam [4] is a good reference.
Theoretically speaking, MI-SOCP is an SOCP problem with integer constraints, hence, we can obtain an exact optimal solution if we rely on the branch-and-bound framework. However, we suffer from a long computation time if we pursue the exact solution. For example, CPLEX can directly handle MI-SOCP problems, but fails to complete the computation of a small case and (it tried to choose genotypes from candidates) in one week. Mullin and Belotti [23] combined the outer approximation method and the branch-and-bound method and reduced the computation time. However, it also requires half a day for the small case to attain the gap , so it is still hard to say that this approach is practical for larger instances . To manage ED problems in a practical time, it is desirable that we find a high-quality approximate solution instead of the exact solution.
In this paper, we propose an integration of conic relaxation approaches and a steep-ascent method originally developed for discrete convex functions to derive a suitable solution for practical usage in a reasonable computation time.
An epoch-making paper on conic relaxation approach was the application of SDP problem to the max-cut problems by Goemans and Williamson [11]. They converted a feasible set of the max-cut problems into the space of positive semidefinite matrices with the rank-one constraint on the matrix variable, and they derived an SDP problem by ignoring this rank-one constraint. They showed that a solution generated with a randomized algorithm from an optimal solution of the resulting SDP problem gave very good approximation to the original max-cut problem. Following this achievement, the SDP relaxation approach has widely been applied to combinatorial optimization problems, see [41] and the references therein. Theoretical evaluation of the quality of the approximate solution were discussed in [14, 16, 29, 36, 49, etc]. Conic relaxation approaches are the relaxation approaches that employs linear programming (LP), SOCP or SDP problems. A remarkable points of the three conic programming problems (LP, SOCP, and SDP) is that they can be analyzed in the framework of Euclidean Jordan algebras [9, 32]. Hence, the resulting relaxation problems can be solved in polynomial time by interior-point methods [30] and many software packages are available [33, 35, 45]. Kim and Kojima [17] reported a numerical evaluation on the relaxation approaches using LP, SOCP, and SDP for some quadratic optimization problems.
On the other hand, discrete convex optimization has another abundant research direction. We might consider that a convex function in continuous space is a discrete convex function if we restrict the variable space to the integer points, although this naive intuition is not appropriate because such a function does not always have useful properties of convex functions, and some deep combinatorial or discrete-mathematical considerations are needed for discrete convexity. In the theory of discrete convex analysis [26], two convexity concepts, called L-convexity and M-convexity, play primary roles. L-convex functions and M-convex functions are convex functions with additional combinatorial properties distinguished by ”L” and ”M”, which are conjugate to each other through a discrete version of the Legendre-Fenchel transformation. If a function is an M-convex function, a step-descent method proposed in [27] can find its global minimum.
In this paper, we first introduce conic relaxation problems for the ED problems, and discuss the relations between the relaxation problems. We analyze the theoretical bounds of the randomized algorithm starting from the solution of the SDP relaxation problem. However, when we numerically evaluate these bounds using tree-breeding datasets, we learn that these bounds are not so sharp. Instead of pursuing an exact solution by branch-and-bound frameworks that impose heavy computation costs, our focus is to acquire a favorable solution that is available in a practical computation time. To obtain such a solution, we develop a steep-ascent method that employs the solution obtained from the conic relaxation problems as a starting point. The usual steep-descent method [27] minimizes an objective function on a particular feasible set. Since the ED problem is a maximization problem, we consider a steep-ascent method instead of a steep-descent method. We embed the quadratic constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta into the objective function as a penalty term with a weight computed from the Lagrange multiplier. This new objective function is not an M-concave function, therefore, we cannot guarantee that the solution obtained by the steep-ascent method is a global solution of the ED problem. However, through numerical experiments, we observe that the steep-ascent method generates qualified solutions for the ED problem. In particular, the steep-ascent method starting with the SOCP relaxation problem attains the best performance among the LP, SOCP, and SDP relaxation problems. Actually, we verify from numerical experiments that this approach performs better than existing methods like GENCONT in the viewpoints of both solution quality and computation time.
The rest of this paper is organized as follows. In Section 2, we introduce LP, SOCP, and SDP relaxation problems for the ED problems, and we discuss the strength of these conic relaxations. In Section 3, we analyze the approximation rate of the SDP relaxation based on the work of Tseng [36]. Section 4 gives the details of the steep-ascent method specialized for the ED problems. In Section 5, we present numerical results to compare the conic relaxations and to evaluate the solution acquired by the steep-ascent method. We also compare this result with existing methods. In Section 6, we will give a conclusion and discuss future directions.
1.1 Notation
We use to denote the cardinality of a set . The vector \mbox{\boldmathe}_{S} is the vector of all ones of the lengths . In contrast, we denote by \mbox{\boldmathe}_{i} the vector of all zeros except one in the th position. The symbol \mbox{\mathbb{S}}^{n} is used to denote the space of symmetric matrices, and \mbox{\boldmathX}\succeq\mbox{\boldmathO} indicates that a symmetric matrix is positive semidefinite. The inner-product between \mbox{\boldmathA}\in\mbox{\mathbb{S}}^{n} and \mbox{\boldmathX}\in\mbox{\mathbb{S}}^{n} is defined by \mbox{\boldmathA}\bullet\mbox{\boldmathX}:=\sum_{i=1}^{n}\sum_{j=1}^{n}A_{ij}X_{ij}. The trace of a matrix \mbox{\boldmathA}\in\mbox{\mathbb{S}}^{n} is given by \mbox{Trace}(\mbox{\boldmathA}):=\sum_{i=1}^{n}A_{ii}. For a vector \mbox{\boldmathx}\in\mbox{\mathbb{R}}^{n}, \mbox{\boldmathx}\geq\mbox{\bf 0} indicates the element-wise non-negativity of , that is, .
2 Conic relaxations for equally deployment problems
In this section, we first derive an SDP relaxation problem of an ED problem. Then, by a further relaxation of the positive semidefinite condition using a relaxation technique proposed in [17], we obtain an LP relaxation problem. Finally, we apply a continuous relaxation technique to the ED problem to obtain an SOCP relaxation problem. The reason we employ a different relaxation approach for only the SOCP relaxation is that we can exploit a structural sparsity in the Wright numerator matrix .
A standard form of SDP problems can be given as follow:
[TABLE]
In this standard form, the variable matrix is \mbox{\boldmathX}\in S^{n}. The input matrices in (15) are \mbox{\boldmathC}, \mbox{\boldmathF}_{1},\ldots,\mbox{\boldmathF}_{m}\in\mbox{\mathbb{S}}^{n}, while the vector \mbox{\boldmathb}\in\mbox{\mathbb{R}}^{n} is an input vector. Shortly speaking, a standard SDP form minimizes a linear objective function over linear constraints and a positive semidefinite condition on .
As a first step to derive an SDP relaxation from the ED problem (11), we remove the variables that can be fixed from the box constraints. More precisely, if , we fix . Similarly, we fix if . We ignore the cases , or , since we can immediately detect the infeasibility of the ED problem. Then, we define two sets and so that the two sets separate the set disjointly and is fixed to for while remains as a decision variable for .
Without loss of generality, we assume that , , and . Along with these and , we introduce the vectors \mbox{\boldmathx}_{V} and \mbox{\boldmathc}_{F} that divide \mbox{\boldmathx}\in\mbox{\mathbb{R}}^{Z} into the two parts \mbox{\boldmathx}=\left(\begin{array}[]{cc}\mbox{\boldmathx}_{V}\\ \mbox{\boldmathc}_{F}\end{array}\right). We also divide the Wright numerator matrix into the four parts; \mbox{\boldmathA}=\left(\begin{array}[]{cc}\mbox{\boldmathA}_{VV}&\mbox{\boldmathA}_{VF}\\ \mbox{\boldmathA}_{FV}&\mbox{\boldmathA}_{FF}\end{array}\right). The sizes of \mbox{\boldmathA}_{VV}, \mbox{\boldmathA}_{FV}(=\mbox{\boldmathA}_{VF}^{T}), and \mbox{\boldmathA}_{FF} are , , and , respectively. We further partition the vectors and the matrices that appear in the ED problem into the corresponding parts;
[TABLE]
Note that we also removed the box constraints \mbox{\boldmathl}\leq\mbox{\boldmathx}\leq\mbox{\boldmathu} from the ED problem by fixing the variables in \mbox{\boldmathx}_{F} to \mbox{\boldmathc}_{F}. We count the number of that is fixed to by . Therefore, we will choose genotypes from candidates in (20), while we choose genotypes from candidates in the original ED problem (11).
Remark 2.1**.**
We can assume and without loss of generality. In the case , we can detect the infeasibility of the problem (11). If , we have . Therefore, is also fixed with , and all the variables can be fixed without solving (20).
We change the decision variables by \mbox{\boldmathy}_{V}:=2N\mbox{\boldmathx}_{V}-\mbox{\boldmathe}_{V}\in\mbox{\mathbb{R}}^{|V|} and we use to denote the th element of \mbox{\boldmathy}_{V}. Then, the binary constraints are mapped to . Even without employing this variable change, we can also directly apply the SDP relaxation method in a similar way to [12]. The reason we employed this variable change is for the later discussion in Section 4 so that most of the matrices \mbox{\boldmathB}^{k} there will be diagonal matrices.
We will denote the th element of \mbox{\boldmathy}_{V} by . We define , \bar{\mbox{\boldmathg}}_{V}:=\frac{1}{4N}(\mbox{\boldmathg}_{V}-g_{\min}\mbox{\boldmathe}_{V}), \bar{g}:=\frac{1}{2N}(\mbox{\boldmathg}_{V}-g_{\min}\mbox{\boldmathe}_{V})^{T}\mbox{\boldmathe}_{V}+(\mbox{\boldmathg}_{F}-g_{\min}\mbox{\boldmathe}_{F})^{T}\mbox{\boldmathc}_{F}+g_{\min}, \bar{\mbox{\boldmathc}}_{F}:=\mbox{\boldmathA}_{VV}\mbox{\boldmathe}_{V}+2N\mbox{\boldmathA}_{VF}\mbox{\boldmathc}_{F}, \bar{\theta}:=2N^{2}(2\theta-\mbox{\boldmathc}_{F}^{T}\mbox{\boldmathA}_{FF}\mbox{\boldmathc}_{F})-\frac{1}{2}\mbox{\boldmathe}_{V}^{T}\mbox{\boldmathA}_{VV}\mbox{\boldmathe}_{V}-2N\mbox{\boldmathc}_{F}^{T}\mbox{\boldmathA}_{FV}\mbox{\boldmathe}_{V}, and \bar{N}:=2N(1-\mbox{\boldmathe}_{F}^{T}\mbox{\boldmathc}_{F})-|V|=2(N-p)-|V|. From these definitions, it is easy to check \bar{\mbox{\boldmathg}}_{V}\geq\mbox{\bf 0} and \mbox{\boldmathg}^{T}\mbox{\boldmathx}=2\bar{\mbox{\boldmathg}}_{V}^{T}\mbox{\boldmathy}_{V}+\bar{g} using \mbox{\boldmathe}^{T}\mbox{\boldmathx}=1. We now have another expression of the ED problem;
[TABLE]
By introducing a variable matrix \mbox{\boldmathY}_{VV}\in\mbox{\mathbb{S}}^{|V|}, we apply the lift-and-project method of Lovász and Schrijver [20]. As a result, we obtain one more equivalent form;
[TABLE]
The key property for the equivalence between (25) and 56 is \mbox{\boldmathY}_{VV}=\mbox{\boldmathy}_{V}\mbox{\boldmathy}_{V}^{T} from the rank-1 constraint on the matrix \left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right). We will denote the th element of \mbox{\boldmathY}_{VV} by . The equality for should holds for feasible solution of (56), hence \left(\begin{array}[]{cc}-1&\mbox{\bf 0}^{T}\\ \mbox{\bf 0}&\mbox{\boldmathe}_{i}\mbox{\boldmathe}_{i}^{T}\end{array}\right)\bullet\left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right)=0 leads to the binary constraint . In (56), we introduced a redundant constraint (\mbox{\boldmathe}_{V}\mbox{\boldmathe}_{V}^{T})\bullet\mbox{\boldmathY}_{VV}=\bar{N}^{2} that was derived from (\mbox{\boldmathe}_{V}^{T}\mbox{\boldmathy}_{V})^{2}=\bar{N}^{2} and \mbox{\boldmathY}_{VV}=\mbox{\boldmathy}_{V}\mbox{\boldmathy}_{V}^{T}. It is known that redundant constraints of this type make the SDP relaxation tighter, and we can often obtain better approximate solution. The hardest constraint in (56) is the rank-1 constraint. This constraint embraces a nature of combinatorial optimization. By removing this hardest constraint, we build an SDP relaxation problem and we denote its optimal value by .
[TABLE]
When we further relax the positive semidefinite constraint \left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right)\succeq\mbox{\boldmathO}, we can obtain an LP relaxation problem. In general, a matrix \mbox{\boldmathX}\in\mbox{\mathbb{S}}^{n} is positive semidefinite if and only if \mbox{\boldmathu}^{T}\mbox{\boldmathX}\mbox{\boldmathu}\geq 0 for \forall\mbox{\boldmathu}\in\mbox{\mathbb{R}}^{n}. For the positive semidefinite constraint of (85), we choose a set of vectors \mbox{\boldmathu}_{ij}=\mbox{\boldmathe}_{i}-\mbox{\boldmathe}_{j}\in\mbox{\mathbb{R}}^{1+|V|} for and as a subset of \mbox{\mathbb{R}}^{1+|V|}. We use to denote the non-diagonal upper-triangular position of \mbox{\boldmathY}_{VV}, that is . The key step to derive an LP relaxation problem is the following step:
[TABLE]
From the constraints for in (85), the constraints and are linear constraints in nature. Consequently, we reach an LP relaxation problem, whose optimal value is denoted as .
[TABLE]
We now move our focus to an SOCP relaxation problem. In a similar way to the above step that derives (119) from (85), it may be possible to apply an SOCP relaxation technique developed in [17] to (85). In contrast, we utilize a continuous relaxation technique that converts the binary constraint into a continuous constraint . The main reason of this continuous relaxation is that we can keep the efficient SOCP formula of [47] that extensively exploits a structural sparsity of the Wright numerator matrix .
A second-order cone of dimension is defined by \mbox{\cal K}^{q}:=\left\{\mbox{\boldmathx}\in\mbox{\mathbb{R}}^{q}:x_{1}\geq\sqrt{\sum_{i=2}^{n}x_{i}^{2}}\right\}. A standard form of second-order cone programming (SOCP) problem in this paper is given as follows:
[TABLE]
The decision variable here is \mbox{\boldmathx}\in\mbox{\mathbb{R}}^{n} and the objective function is a linear function with a constant vector \mbox{\boldmathc}\in\mbox{\mathbb{R}}^{n}. The linear constraints are encoded with a matrix \mbox{\boldmathF}\in\mbox{\mathbb{R}}^{m\times n} and a vector \mbox{\boldmathb}\in\mbox{\mathbb{R}}^{m}. The second-oder cone constraint is given with a vector \mbox{\boldmathh}\in\mbox{\mathbb{R}}^{q} and a matrix \mbox{\boldmathH}\in\mbox{\mathbb{R}}^{q\times n}. A more general SOCP formulation often includes a Cartesian product of second-order cones. However, only one second-order cone is enough for the discussions in this paper.
Yamashita et al. [47] introduced a new vector \mbox{\boldmathz}:=\mbox{\boldmathA}\mbox{\boldmathx}\in\mbox{\mathbb{R}}^{Z}, and converted the quadratic constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta into ||\mbox{\boldmathB}\mbox{\boldmathz}||\leq\sqrt{2\theta} with a matrix \mbox{\boldmathB}\in\mbox{\mathbb{R}}^{Z\times Z} that satisfies \mbox{\boldmathB}^{T}\mbox{\boldmathB}=\mbox{\boldmathA}^{-1}. Though the Wright numerator matrix itself is not a sparse matrix, the matrices \mbox{\boldmathA}^{-1} and possess favorable sparsity. The computation time reduction reported in [47] was mainly derived from these sparsity. Using these new vector and matrix , we transformed the ED problem (11) into the following SOCP problem with integer constraints;
[TABLE]
Here, we use the notation [\mbox{\boldmathA}^{-1}\mbox{\boldmathz}]_{i} to denote the th element of \mbox{\boldmathA}^{-1}\mbox{\boldmathz}. It may seem that we would remove \mbox{\boldmathx}_{F} from this formulation by fixing \mbox{\boldmathx}_{F}=\mbox{\boldmathc}_{F} and reduce the sizes the problem. However, such elimination would strongly diminish the efficiency of the SOCP problem, since it completely destroys the favorable sparsity that appear in \mbox{\boldmathA}^{-1} and .
By applying the continuous relaxation to the binary constraints, we obtain an SOCP relaxation problem of the ED problem;
[TABLE]
, , and , respectively. From the derivation of the LP relaxation problem (119), it is natural that the SDP relaxation problem (85) gives closer an optimal value than the LP relaxation problem, that is, we know . In contrast, the relation of the SOCP relaxation (135) is not so explicit, since the SOCP relaxation was derived by a continuous relaxation independently from the SDP or LP relaxation.
The strength of these relaxation problems can be summarized in Lemma 2.3. For the discussion there, we prepare some notation and introduce an assumption. We use \mbox{\cal S}_{m}(\mbox{\boldmathv}) to denote the sum of the smallest elements of \mbox{\boldmathv}\in\mbox{\mathbb{R}}^{n}. More precisely, when is the sorted vector of in the ascending order, the definition of \mbox{\cal S}_{m}(\mbox{\boldmathv}) is given by \mbox{\cal S}_{m}(\mbox{\boldmathv}):=\sum_{i=1}^{m}\hat{v}_{i}. The symbol indicates the set of the collection of \mbox{\boldmathA}_{VV} with respect to , that is, . We define a vector \hat{\mbox{\boldmathy}}_{V}\in\mbox{\mathbb{R}}^{|V|} by for and for . This vector satisfies \mbox{\boldmathe}_{V}^{T}\hat{\mbox{\boldmathy}}_{V}=\bar{N}. In the following this discussion, we make the following assumption on the input data of the ED problem (11). From preliminary numerical tests, we verified that this assumption holds for practical datasets of pine orchards and datasets generated by simulations. The details of these dataset will be described in Section 5.
Assumption 2.2**.**
The input data of (11) satisfies
[TABLE]
where .
We should ensure that is a positive integer, otherwise we need to manage a fractional number in the definition of . The positiveness is derived from by of Remark 2.1, and is integer by
[TABLE]
We are now prepared to examine the relation between the relaxation problems.
Lemma 2.3**.**
It holds for the optimal values of the relaxation problems that
[TABLE]
Furthermore, if Assumption 2.2 holds, then
[TABLE]
Proof: [] When we derived (85), we ignored the rank-1 constraint in (56). From this derivation, for any feasible solution \mbox{\boldmathx}\in\mbox{\mathbb{R}}^{Z} of (11), the corresponding vector \mbox{\boldmathy}_{V}\in\mbox{\mathbb{R}}^{|V|} through the connections \mbox{\boldmathx}=\left(\begin{array}[]{cc}\mbox{\boldmathx}_{V}\\ \mbox{\boldmathc}_{F}\end{array}\right), then \mbox{\boldmathy}_{V}=2N\mbox{\boldmathx}_{V}-\mbox{\boldmathe}_{V} is also a feasible solution of (85). Furthermore, from these connections hold, it holds that \mbox{\boldmathg}^{T}\mbox{\boldmathx}=\bar{\mbox{\boldmathg}}^{T}\mbox{\boldmathy}_{V}+\bar{g}. The objective functions of (56) and (85) are same and the feasible region of (85) is wider than that of (56) substantially, hence, we have .
[] We take any feasible solution \mbox{\boldmathy}_{V}\in\mbox{\mathbb{R}}^{|V|} and \mbox{\boldmathY}_{VV}\in\mbox{\mathbb{S}}^{|V|} of (85). It is enough to check that \mbox{\boldmathz}=\mbox{\boldmathA}\left(\begin{array}[]{c}\frac{\mbox{\boldmathy}_{V}+\mbox{\boldmathe}_{V}}{2N}\\ \mbox{\boldmathc}_{F}\end{array}\right) is a feasible solution of (135).
From \left(\begin{array}[]{cc}-2\bar{N}&\mbox{\boldmathe}_{V}^{T}\\ \mbox{\boldmathe}_{V}&\mbox{\boldmathO}\end{array}\right)\bullet\left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right)=0, we obtain \mbox{\boldmathe}_{V}^{T}\mbox{\boldmathy}_{V}=\bar{N}=2N(1-\mbox{\boldmathe}_{F}^{T}\mbox{\boldmathc}_{F})-|V|, hence,
[TABLE]
By applying the Schur complement to the positive semidefinite condition \left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right)\succeq\mbox{\boldmathO}, it holds \mbox{\boldmathY}_{VV}-\mbox{\boldmathy}_{V}\mbox{\boldmathy}_{V}^{T}\succeq\mbox{\boldmathO}. Since \mbox{\boldmathA}\bullet\mbox{\boldmathX}\geq 0 holds for any two positive semidefinite matrices of the same dimension and [34] and the Wright numerator matrix is always positive definite, it holds \mbox{\boldmathA}_{VV}\bullet(\mbox{\boldmathY}_{VV}-\mbox{\boldmathy}_{V}\mbox{\boldmathy}_{V}^{T})\geq 0, therefore, \mbox{\boldmathA}_{VV}\bullet\mbox{\boldmathY}_{VV}\geq\mbox{\boldmathy}_{V}^{T}\mbox{\boldmathA}_{VV}\mbox{\boldmathy}_{V}. Using the relation \left(\begin{array}[]{cc}-2\bar{\theta}&\bar{\mbox{\boldmathc}}_{F}^{T}\\ \bar{\mbox{\boldmathc}}_{F}&\mbox{\boldmathA}_{VV}\end{array}\right)\bullet\left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right)\leq 0, we obtain \mbox{\boldmathy}_{V}^{T}\mbox{\boldmathA}_{VV}\mbox{\boldmathy}_{V}+2\bar{\mbox{\boldmathc}}_{F}^{T}\mbox{\boldmathy}_{V}\leq 2\bar{\theta}. From the definitions of \mbox{\boldmathy}_{V},\bar{\mbox{\boldmathc}}_{F},\bar{\theta},\mbox{\boldmathB} and , we can derive \mbox{\boldmathz}\mbox{\boldmathB}^{T}\mbox{\boldmathB}\mbox{\boldmathz}\leq 2\theta, therefore, \left(\begin{array}[]{c}\sqrt{2\theta}\\ \mbox{\boldmathB}\mbox{\boldmathz}\end{array}\right)\in\mbox{\cal K}^{1+Z}. From \mbox{\boldmathY}_{VV}-\mbox{\boldmathy}_{V}\mbox{\boldmathy}_{V}^{T}\succeq\mbox{\boldmathO}, we also have for . Furthermore, due to the constraint \left(\begin{array}[]{cc}-1&\mbox{\bf 0}^{T}\\ \mbox{\bf 0}&\mbox{\boldmathe}_{i}\mbox{\boldmathe}_{i}^{T}\end{array}\right)\bullet\left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right)=0, it holds for , consequently -\mbox{\boldmathe}_{V}\leq\mbox{\boldmathy}_{V}\leq\mbox{\boldmathe}_{V}. From \mbox{\boldmathA}^{-1}\mbox{\boldmathz}=\left(\begin{array}[]{c}\frac{\mbox{\boldmathy}_{V}+\mbox{\boldmathe}_{V}}{2N}\\ \mbox{\boldmathc}_{F}\end{array}\right), it is now clear that 0\leq[\mbox{\boldmathA}^{-1}\mbox{\boldmathz}]_{i}\leq\frac{1}{N} for and that [\mbox{\boldmathA}^{-1}\mbox{\boldmathz}]_{i}=c_{i} for . Furthermore, the objective value of (85) at \mbox{\boldmathy}_{V} is same as that of (135) at if \mbox{\boldmathz}=\mbox{\boldmathA}\left(\begin{array}[]{c}\frac{\mbox{\boldmathy}_{V}+\mbox{\boldmathe}_{V}}{2N}\\ \mbox{\boldmathc}_{F}\end{array}\right). Hence, we obtain .
[] We first consider an LP problem
[TABLE]
where the decision variable is \mbox{\boldmath\eta}\in\mbox{\mathbb{R}}^{n} and the input vector is \mbox{\boldmathc}\in\mbox{\mathbb{R}}^{n} and is a positive integer. The optimal value of this LP problem is \mbox{\cal S}_{K}(\mbox{\boldmathc}) and this value can be attained at \hat{\mbox{\boldmath\eta}}\in\mbox{\mathbb{R}}^{n} such that for and for .
If we ignore the quadratic constraint of (135) and we reverse the variable into \mbox{\boldmathx}=\mbox{\boldmathA}^{-1}\mbox{\boldmathz}, we obtain an optimization problem of form
[TABLE]
Since , the optimal value of (145) is given by -\frac{\mbox{\cal S}_{N-p}(-\mbox{\boldmathg}_{V})}{N}+\mbox{\boldmathg}_{F}^{T}\mbox{\boldmathc}_{F} in a similar way to (140) and an optimal solution is \hat{\mbox{\boldmathx}}_{V}:=\frac{\hat{\mbox{\boldmathy}}_{V}+\mbox{\boldmathe}_{V}}{2N}. Therefore, it holds that OPT_{SOCP}\leq\mbox{\boldmathg}_{V}^{T}\hat{\mbox{\boldmathx}}_{V}+\mbox{\boldmathg}_{F}^{T}\mbox{\boldmathc}_{F}.
Next, we define to denote the optimal value of the following LP problem;
[TABLE]
We convert this problem introducing for . The following LP problem is equivalent to (151), therefore, its optimal value must be .
[TABLE]
The structure of this problem is same as (140), hence, it holds that \rho_{LP}=4\mbox{\cal S}_{\hat{N}}(\hat{A}_{\hat{N}})-\mbox{\boldmathe}_{V}^{T}\mbox{\boldmathA}_{VV}\mbox{\boldmathe}_{V}+2\mbox{Trace}(\mbox{\boldmathA}_{VV}).
Let \hat{\mbox{\boldmathY}}_{VV} be a part of an optimal solution of (151). From Assumption 2.2, it holds that
[TABLE]
Furthermore, \hat{\mbox{\boldmathy}}_{V} satisfies for and \mbox{\boldmathe}_{V}^{T}\hat{\mbox{\boldmathy}}_{V}=\bar{N} by its definition and \hat{\mbox{\boldmathY}}_{VV} satisfies all the constraints of (151). Consequently, the pair \hat{\mbox{\boldmathy}}_{V} and \hat{\mbox{\boldmathY}}_{VV} is a feasible solution of (119) and this leads to the inequality we wanted to obtain.
[TABLE]
∎
Remark 2.4**.**
Since an optimal solution of a further relaxation problem of (119)
[TABLE]
is \hat{\mbox{\boldmathy}}_{V}, its optimal value 2\bar{\mbox{\boldmathg}}_{V}^{T}\hat{\mbox{\boldmathy}}_{V}+\bar{\mbox{\boldmathg}} must be an upper bound of . On the other hand, from the proof of Lemma 2.3, when Assumption 2.2 holds, there exists some \hat{\mbox{\boldmathY}}_{VV}\in\mbox{\mathbb{S}}^{|V|} such that the pair of \hat{\mbox{\boldmathy}}_{V} and \hat{\mbox{\boldmathY}}_{VV} is a feasible solution of (119) with the objective value 2\bar{\mbox{\boldmathg}}_{V}^{T}\hat{\mbox{\boldmathy}}_{V}+\bar{\mbox{\boldmathg}}. Therefore, \hat{\mbox{\boldmathy}}_{V} is also an optimal solution of (119). This indicates that we can obtain the solution of (119) at the computation cost for sorting \mbox{\boldmathg}_{V} instead of solving (119) as an LP problem, when Assumption 2.2 holds.
This remark implies that the LP relaxation (119) is not so tight against the original ED problem (11). In contrast, we observed through preliminary numerical tests that the vector \left(\begin{array}[]{c}\hat{\mbox{\boldmathx}}_{V}\\ \mbox{\boldmathc}_{F}\end{array}\right) defined with an optimal solution \hat{\mbox{\boldmathx}}_{V} of (145), is not always a feasible solution of (135) even if Assumption 2.2 holds. Therefore, the feasible region of the SOCP relaxation problem is strictly narrower than that of the LP relaxation problem, and the SOCP relaxation gives a tighter approximation than the LP relaxation in general, even though the relaxation were derived independently.
Remark 2.5**.**
The SDP relaxation problem (85) has no interior-feasible point.
If a pair \mbox{\boldmathy}_{V}\in\mbox{\mathbb{R}}^{|V|} and \mbox{\boldmathY}_{VV}\in\mbox{\mathbb{S}}^{|V|} satisfies all the constraint of (85), the pair is a feasible point. When the matrix \left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right) is a positive definite matrix for some feasible point \mbox{\boldmathy}_{V}\in\mbox{\mathbb{R}}^{|V|} and \mbox{\boldmathY}_{VV}\in\mbox{\mathbb{S}}^{|V|}, we say that (85) has an interior-feasible point. We can show that \left(\begin{array}[]{cc}1&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right) is not positive definite for any feasible point of (85). To show this, we take a feasible point \mbox{\boldmathy}_{V}\in\mbox{\mathbb{R}}^{|V|} and \mbox{\boldmathY}_{VV}\in\mbox{\mathbb{S}}^{|V|}. Then, we have \mbox{\boldmathe}_{V}^{T}\mbox{\boldmathy}_{V}=\bar{N} and (\mbox{\boldmathe}_{V}\mbox{\boldmathe}_{V}^{T})\bullet\mbox{\boldmathY}_{VV}=\bar{N}^{2}. If , it holds
[TABLE]
In addition, for the case , it holds
[TABLE]
In either case, there exists a nonzero vector that makes the quadratic from zero, the matrix is not positive definite, therefore, (85) has no interior-feasible point.
3 Theoretical evaluation of the SDP relaxation problems with a randomized algorithm
The solutions obtained from the conic relaxation problem (85), (119) and (135) are not always a feasible solution of the ED problem (11), since we ignored some constraints of the NP-hard problem to derive the conic relaxation problems that are solvable in polynomial time. When SDP relaxation approaches are used, examine randomized algorithms often follow to generate feasible solutions. A randomized algorithm using the solutions obtained through SDP relaxation problems was first introduced for max-cut problems in [11]. They showed that the expectation objective value obtained by their randomized algorithm on average was at least 0.878 of that of an SDP relaxation problem. Since the optimal value of a max cut problem exists between an objective value of any feasible solution and the value obtained from the SDP relaxation problem, their algorithm have an expected approximation factor of 0.878. Many researches followed [11] to extend its results to more general quadratic-constraint problems using the framework of SDP relaxation methods. Among them, Tseng [36] discussed one of the most general cases and gave its probabilistic analysis. Wu et al. [44] also analyzed the expectation values using a different randomized algorithm.
In this section, we employ the result of [36] to give theoretical bounds on the expected objective value of a randomized algorithm. Tseng [36] applied the SDP relaxation methods to a quadratically-constrained quadratic programming (QCQP) problem:
[TABLE]
Here, the variable is \mbox{\boldmathy}\in\mbox{\mathbb{R}}^{n}, while the input data are \mbox{\boldmathA}^{0},\ldots,\mbox{\boldmathA}^{m}\in\mbox{\mathbb{S}}^{n}, \mbox{\boldmathb}^{0},\ldots,\mbox{\boldmathb}^{m}\in\mbox{\mathbb{R}}^{n} and c^{0},c^{1}\ldots,c^{m}\in\mbox{\mathbb{R}}. For simplicity, the constant in the objective function is fixed to . The QCQP originally discussed in [36] is a minimization problem, but we consider a maximization problem since the ED problem (11) is a maximization problem.
When we apply the lift-and-project method of Lovász and Schrijver [20] to (162), the resultant SDP relaxation problem is given as follows:
[TABLE]
where \mbox{\boldmathB}^{k}:=\left(\begin{array}[]{cc}c^{k}&(\mbox{\boldmathb}^{k})^{T}\\ \mbox{\boldmathb}^{k}&\mbox{\boldmathA}^{k}\end{array}\right) for and \mbox{\boldmathB}^{m+1}:=\left(\begin{array}[]{cc}1&\mbox{\bf 0}^{T}\\ \mbox{\bf 0}&\mbox{\boldmathO}\end{array}\right), and the decision variable is \mbox{\boldmathY}\in\mbox{\mathbb{S}}^{1+n}. In the following discussions, we start the row or column index of from zero, therefore, the elements of are denoted by . It is known that if we add the rank-1 constraint \mbox{rank}(\mbox{\boldmathY})=1 to (166), the two problems (162) and (166) are equivalent. In other words, we ignored the rank-1 constraint from (162) to derive (166), hence, .
The randomized algorithm of [36] can be summarized as follow. We assume that (162) and (166) are feasible and that (166) has an optimal solution, denoted as \mbox{\boldmathY}^{*}. This solution \mbox{\boldmathY}^{*} is factorzied with a matrix \mbox{\boldmathV}\in\mbox{\mathbb{R}}^{(1+n)\times(1+n)} such that \mbox{\boldmathY}^{*}=\mbox{\boldmathV}^{T}\mbox{\boldmathV}. Such is available, for example, by the Cholesky factorization or the eigenvalue decomposition. We use \mbox{\boldmathv}^{0},\mbox{\boldmathv}^{1},\ldots,\mbox{\boldmathv}^{n}\in\mbox{\mathbb{R}}^{1+n} to denote the columns of . Then, a vector \mbox{\boldmathv}\in\mbox{\mathbb{R}}^{1+n} is chosen randomly from the unit sphere in \mbox{\mathbb{R}}^{1+n} based on uniform distribution. Finally, the randomized algorithm outputs a solution \tilde{\mbox{\boldmathy}}\in\mbox{\mathbb{R}}^{1+n} defined by
[TABLE]
where if and if . We remark that from the definition of \mbox{\boldmathB}^{m+1}, it always holds that , hence, \tilde{y}_{0}=\sqrt{1}(\mbox{sign}(\mbox{\boldmathv}^{T}\mbox{\boldmathv}^{0}))^{2}=1.
The set is introduced to indicate diagonal-matrix constraints of (162);
[TABLE]
To measure a shift in the objective function, is defined as the optimal value of the following SDP problem
[TABLE]
Tseng [36] showed a relation between the expected objective value of the generated solution \tilde{\mbox{\boldmathy}} and the optimal values of the SDP problems.
Theorem 3.1**.**
[36, Theorem 2]* If the SDP relaxation problem (166) has an optimal solution \mbox{\boldmathY}^{*} and a set \left\{\mbox{\boldmathy}\in\mbox{\mathbb{R}}^{n}:\mbox{\boldmathy}^{T}\mbox{\boldmathA}^{k}\mbox{\boldmathy}+(\mbox{\boldmathb}^{k})^{T}\mbox{\boldmathy}+c^{k}\leq 0,k\in\mbox{\cal I}\right\} is bounded, then*
[TABLE]
Let us return to the ED problem (11). We analyze the performance of the output solution \tilde{\mbox{\boldmathy}}_{V} that is generated by the above randomized algorithm using the optimal solution \mbox{\boldmathY}^{*} of the SDP relaxation problem (85). From the form of (85), the objective value at \tilde{\mbox{\boldmathy}}_{V} is 2\mbox{\boldmathg}_{V}^{T}\tilde{\mbox{\boldmathy}}+\bar{g}. The following lemma provides a theoretical aspects on the expected value of this objective function.
Lemma 3.2**.**
For the ED problem (11), the expected objective value obtained through the randomized algorithm is bounded by
[TABLE]
where .
Proof:
First, we derive the lower bound of the objective function by use of Theorem 3.1. To embed the SDP relaxation problem (85) arising from the ED problem into the framework developed in [36], we embed the variable vector \mbox{\boldmathy}_{V} and matrix \mbox{\boldmathY}_{VV} into the matrix \mbox{\boldmathY}\in\mbox{\mathbb{S}}^{1+|V|} as \mbox{\boldmathY}=\left(\begin{array}[]{cc}Y_{00}&\mbox{\boldmathy}_{V}^{T}\\ \mbox{\boldmathy}_{V}&\mbox{\boldmathY}_{VV}\end{array}\right). In particular, we identify for . For the input matrices \mbox{\boldmathB}^{0},\ldots,\mbox{\boldmathB}^{2|V|+5}, we prepare
[TABLE]
The number of input matrices in the form of (166) is . For example, \mbox{\boldmathB}^{5+i}\bullet\mbox{\boldmathY}\leq 0 and \mbox{\boldmathB}^{5+|V|+i}\bullet\mbox{\boldmathY}\leq 0 lead to for . In addition, is guaranteed by \mbox{\boldmathB}^{m+1}\bullet\mbox{\boldmathY}=1.
The set of diagonal constraints is \mbox{\cal I}=\{5+i:i\in V\}\cup\{5+|V|+i:i\in V\}. From this , the feasible set of (170) is given by \mbox{\cal F}:=\left\{\mbox{\boldmathY}\in\mbox{\mathbb{S}}^{1+|V|}:Y_{ii}=1\mbox{ for }i=0,1,\ldots,|V|\mbox{ and }\mbox{\boldmathY}\succeq\mbox{\boldmathO}\right\}. Hence, we obtain
[TABLE]
Here, a combination of the matrix-completion method [10, 28, 48] with a property \bar{\mbox{\boldmathg}}_{V}\geq\mbox{\bf 0} ensures that an optimal solution of this minimization problem is given as \mbox{\boldmathY}=\left(\begin{array}[]{cc}1&-\mbox{\boldmathe}_{V}^{T}\\ -\mbox{\boldmathe}_{V}&\mbox{\boldmathe}_{V}\mbox{\boldmathe}_{V}^{T}\end{array}\right).
We should note that the SDP relaxation problem (85) has a constant term in the objective function, but we have to set to employ Theorem 3.1. By taking the shift of into account, Theorem 3.1 gives a lower bound;
[TABLE]
To consider an upper bound, we first evaluate for . From for in (85) and the definition of \tilde{\mbox{\boldmathy}}_{V}, and it holds that if \mbox{sign}(\mbox{\boldmathv}^{T}\mbox{\boldmathv}^{0})=\mbox{sign}(\mbox{\boldmathv}^{T}\mbox{\boldmathv}^{i}), and if \mbox{sign}(\mbox{\boldmathv}^{T}\mbox{\boldmathv}^{0})=-\mbox{sign}(\mbox{\boldmathv}^{T}\mbox{\boldmathv}^{i}). The discussion in [11] indicates that the probability of the event \mbox{sign}(\mbox{\boldmathv}^{T}\mbox{\boldmathv}^{0})=\mbox{sign}(\mbox{\boldmathv}^{T}\mbox{\boldmathv}^{0}) is given as . Therefore, we have
[TABLE]
The last inequality was derived from the inequality for (Lemma 3.4 of [11]) and due to \mbox{\boldmathY}^{*}\succeq\mbox{\boldmathO} and .
As a result, we obtain an inequality
[TABLE]
∎
From a theoretical viewpoint, Lemma 3.2 gives the bounds on the expected objective value E[2\bar{\mbox{\boldmathg}}_{V}^{T}\tilde{\mbox{\boldmathy}}_{V}+\bar{g}] of the randomized algorithm. When we executed preliminary experiments, we observed that the interval between the lower and upper bounds are not so sharp. Table 1 presents the lower and upper bounds and the expected objective value. The dataset we used here is a subset of datasets in Section 5. The first column shows , the number of genotype candidates. We fix the number of chosen candidates to . The third and fourth columns are the lower and the upper bounds in Lemma 3.2, respectively. The expected objective value is shown in the third column, and it is obtained by generating the random vector thousand times and taking the average of the thousand trials. The fifth column is the optimal value of the SDP relaxation problem 85.
For the smallest size , the gap between the lower and the upper bound was not so large. However, when we tried the larger problems, the gap was getting worse. In particular, the ratio of the upper bound to the expected objective value for the case goes beyond 5.34.
Another aspect in the randomized algorithm is that the expected objective value is always larger than . A reason of this unfavorable aspect is that the generated solution \tilde{\mbox{\boldmathy}}_{V} is not guaranteed to satisfy the constraint \mbox{\boldmathy}_{V}^{T}\mbox{\boldmathA}_{VV}\mbox{\boldmathy}_{V}+2\bar{\mbox{\boldmathc}}_{F}^{T}\mbox{\boldmathy}_{V}\leq 2\bar{\theta} that corresponds to \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta of (11). Though Theorem 4 of [36] estimates the number of randomly generated solutions required for approximate feasible solutions with high probability, this cannot be applied to the discussion in this paper, since the current discussion does not fully satisfy the assumption of the theorem.
Due to this weaker bounds reported in Table 1, we are determined to seek an optimization method that can obtain a reasonable solution for practical use. This motivated us to develop a local search method based on the steep-descent method for discrete convex functions.
4 Steepest-ascent method
In contrast to mixed-integer linear programming problems for which many solvers have been developed, a principal difficulty in the ED problem (11) arises from the nonlinear constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta. To obtain a sensible solution in a short time, we embed the violation against this constraint into the objective function as a penalty term using a penalty weight and focus the following optimization problem
[TABLE]
where \hat{\mbox{\cal F}}:=\left\{\mbox{\boldmathx}\in\mbox{\mathbb{R}}^{Z}:\mbox{\boldmathe}^{T}\mbox{\boldmathx}=1,\mbox{\boldmathl}\leq\mbox{\boldmathx}\leq\mbox{\boldmathu},x_{1},\ldots,x_{Z}\in\left\{0,\frac{1}{N}\right\}\right\}.
We give a validity of (174) by the next lemma which shows that if we take a large , this optimization problem with a penalty term (174) is equivalent to the original problem (11) .
Lemma 4.1**.**
Let \mbox{\boldmathx}(\lambda)\in\mbox{\mathbb{R}}^{Z} be an optimal solution of (174). There exists a such that \mbox{\boldmathx}(\lambda) is an optimal solution of (11) for .
Proof: Let be the optimal value of the following optimization problem;
[TABLE]
From this definition, can take either zero or a positive number.
If , the quadratic constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta holds for \forall\mbox{\boldmathx}\in\hat{\mbox{\cal F}}. Therefore, this constraint vanishes from (11) and the penalty term in (174) has no effect. Hence, the two problems (11) and (174) are equivalent for any .
For the case , since \hat{\mbox{\cal F}} is composed of a finite number of points, the reciprocal number of is a finite number. Therefore, we can take . To show this by a contradiction, we assume that \mbox{\boldmathx}(\lambda)^{T}\mbox{\boldmathA}\mbox{\boldmathx}(\lambda)\leq 2\theta does not hold for . Then, we have
[TABLE]
Here, we used \mbox{\boldmathe}^{T}\mbox{\boldmathx}(\lambda)=1 and \mbox{\boldmathx}(\lambda)\geq\mbox{\bf 0} since \mbox{\boldmathx}(\lambda)\in\hat{\mbox{\cal F}}. On the contrary, from the assumption that (11) has a feasible point, the objective value of (174) at this feasible point is at least . This indicates that \mbox{\boldmathx}(\lambda) can not be an optimal solution of (174) if \mbox{\boldmathx}(\lambda)^{T}\mbox{\boldmathA}\mbox{\boldmathx}(\lambda)>2\theta. Therefore, we can restrict the feasible region of (174) to the set \{\mbox{\boldmathx}\in\mbox{\mathbb{R}}^{Z}:\mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta\}\cap\hat{\mbox{\cal F}}, and the objective function of (174) is reduced to \mbox{\boldmathg}^{T}\mbox{\boldmathx}. Consequently, the optimal solution of (174) is also optimal for (11). ∎
Since the computation for is almost as hard as the original ED problem, it is not practical to compute . In addition, when we maximize f_{\lambda}(\mbox{\boldmathx}), extremely large makes the computation numerically unstable. As an appropriate value for the penalty weight , we make the use of the Lagrangian multiplier developed in Meuwissen [22];
[TABLE]
This corresponds to the Lagrangian multiplier of the constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}=2\theta in the following optimization problem.
[TABLE]
The approach of [22] first solves (181), then applies some heuristic method to obtain a solution of (11). Therefore, a maximization of \mbox{\boldmathg}^{T}\mbox{\boldmathx}-\lambda_{0}(\mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}-2\theta) over \mbox{\boldmathe}^{T}\mbox{\boldmathx}=1 is a natural derivation when we consider (181). For f_{\lambda}(\mbox{\boldmathx}), we often employ such that , since we put a strong emphasis on the violation with respect to \max\{\mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}-2\theta,0\}.
We now discuss (174) from the viewpoint of convex functions. The function -f_{\lambda}(\mbox{\boldmathx}) is a convex function in the continuous space \mbox{\mathbb{R}}^{Z}, since \mbox{\boldmathA}\succeq\mbox{\boldmathO} and . Hence, the problem (174) can be cast as a minimization of a convex function over a discrete feasible set.
An M-convex function [26] is a discrete convex function defined on a set in which the sum of the elements of a feasible point is constant. A steepest-descent method for M-convex functions was developed in [27]. When an M-convex function f^{M}(\mbox{\boldmathx}) with a feasible set \mbox{\cal F}^{M} is given, the steepest-descent method starts from an initial point \mbox{\boldmathx}^{0}\in\mbox{\cal F}^{M}, and finds the next point \mbox{\boldmathx}^{1} from a neighborhood \mbox{\cal N}(\mbox{\boldmathx}^{0})\subset\mbox{\cal F}^{M} that decreases the objective function f^{M}(\mbox{\boldmathx}) with the largest margin. Here, \mbox{\cal N}(\mbox{\boldmathx}^{0}):=\{\mbox{\boldmathx}+\mbox{\boldmathe}_{i}-\mbox{\boldmathe}_{j}\in\mbox{\cal F}^{M}:i,j=1,\ldots,n\}. In other words, \mbox{\boldmathx}^{1} is chosen so that f^{M}(\mbox{\boldmathx}_{1})\leq f^{M}(\mbox{\boldmathx}) for any \mbox{\boldmathx}\in\mbox{\cal N}(\mbox{\boldmathx}^{0}). The steepest-descent method continues the search in neighbors, and it eventually can find a global minimizer since any local minimizer is a global minimizer when the objective function is an M-convex function.
Though -f_{\lambda}(\mbox{\boldmathx}) is not an M-convex function since it can encompass multiple local minimizers that are not always global minimizers, the optimization problem with the penalty term (174) has resemblances to a minimization of an M-convex function. In particular, the feasible set \hat{\mbox{\cal F}} satisfies and the function -f_{\lambda}(\mbox{\boldmathx}) is a convex function in the continuous space \mbox{\mathbb{R}}^{Z}. Therefore, we can expect that the steepest-descent method for M-convex functions will give good direction to solve (174). Furthermore, we can exploit the solution obtained by the conic relaxation problems in Section 3 to generate a starting point \mbox{\boldmathx}^{0}.
When we adjust the steepest-descent method implemented in the software package ODICON 111http://ist.ksc.kwansei.ac.jp/~tutimura/odicon/index.en.html [37] to solve (174), we obtain Algorithm 4.2. Since (174) is a maximization problem, Algorithm 4.2 is a steepest-ascent method.
Algorithm 4.2**.**
A steep-ascent method with a conic relaxation problem for the optimization problem with the penalty term arising from the ED problem
- Step 1:
Solve a conic relaxation problem (85), (119) or (135). If (135) is solved, let \mbox{\boldmathx}^{*} be its optimal solution. For (119) and (85), let \mbox{\boldmathy}_{V}^{*} be its optimal solution and set \mbox{\boldmathx}^{*} by \mbox{\boldmathx}_{V}^{*}:=\mbox{\boldmathy}_{V}^{*} and \mbox{\boldmathx}_{F}^{*}:=\mbox{\boldmathc}_{F}. 2. Step 2:
By sorting \mbox{\boldmathx}^{*}, separate into the two disjoint set and such that for , and that (ties are broken arbitrary). Set the initial point \mbox{\boldmathx}^{0}\in\mbox{\mathbb{R}}^{Z} by for , for , and \mbox{\boldmathx}_{F}^{0}:=\mbox{\boldmathc}_{F}. Set the iteration counter . 3. Step 3:
Select the steepest swap and such that
[TABLE] 4. Step 4:
If there is no improvement, that is f_{\lambda}(\mbox{\boldmathx}^{h}-\frac{1}{N}\mbox{\boldmathe}_{i^{h}}+\frac{1}{N}\mbox{\boldmathe}_{j^{h}})\leq f_{\lambda}(\mbox{\boldmathx}^{h}), output \mbox{\boldmathx}^{h} as a solution and stop. 5. Step 5:
Set \mbox{\boldmathx}^{h+1}:=\mbox{\boldmathx}^{h}-\frac{1}{N}\mbox{\boldmathe}_{i^{h}}+\frac{1}{N}\mbox{\boldmathe}_{j^{h}}. Swap and by and . Set and return to Step 3.
In Step 2, the number of in \mbox{\boldmathx}^{0} is exactly . Due to Step 5, this property is kept through the iterations in the algorithm, hence, the number of in \mbox{\boldmathx}^{h} is also exactly for any . When no improvement can be found, the algorithm stops by Step 4.
Most computation cost of each iteration in Algorithm 4.2 is consumed at the evaluations of in Step 3. The number of the evaluations is determined by the size of neighbor around \mbox{\boldmathx}^{h}, that is, . Therefore, the case requires the heaviest computation cost. Furthermore, to reduce the computation cost, we focus the evaluation of (\mbox{\boldmathx}^{h}-\frac{1}{N}\mbox{\boldmathe}_{i^{h}}+\frac{1}{N}\mbox{\boldmathe}_{j^{h}})^{T}\mbox{\boldmathA}(\mbox{\boldmathx}^{h}-\frac{1}{N}\mbox{\boldmathe}_{i^{h}}+\frac{1}{N}\mbox{\boldmathe}_{j^{h}}). Since ODICON was designed to handle general functions, it accessed all the elements of for each and . By expanding the part as (\mbox{\boldmathx}^{h})^{T}\mbox{\boldmathA}(\mbox{\boldmathx}^{h})+\frac{2}{N}\left(-\mbox{\boldmathe}_{i^{h}}+\mbox{\boldmathe}_{j^{h}}\right)^{T}(\mbox{\boldmathA}\mbox{\boldmathx}^{h})+\frac{1}{N^{2}}\left(A_{i^{h}i^{h}}+A_{j^{h}j^{h}}-2A_{i^{h}j^{h}}\right), we evaluate (\mbox{\boldmathx}^{h})^{T}\mbox{\boldmathA}(\mbox{\boldmathx}^{h}) and (\mbox{\boldmathA}\mbox{\boldmathx}^{h}) only once for each iteration of Algorithm 4.2. This saves 95% of the computation time for Step 3 compared to ODICON.
5 Numerical results
In this section, we report numerical results to verify the performance of the proposed algorithm, Algorithm 4.2. We implemented Algorithm 4.2 with Matlab R2014b. We compared the proposed algorithm with GENCONT [22], the branch-and-bound method implemented in OPSEL 2.0 [24, 23], and IBM CPLEX 12.62. We used an Windows PC with Core i7 3770K (3.5 GHz) and 32 GB memory space for cases. Only when the 32 GB memory space was not enough, we used a Linux server with Opteron 4386 (3.10 GHz) and 128 GB memory space. To solve the LP problem (119), the SOCP problem (135), and the SDP problem (85), we employed CPLEX, ECOS [7], and SDPT-3 [35], respectively. For the steepest-ascent method, we set as the penalty weight in the function f_{\lambda}(\mbox{\boldmathx}) of (174).
The data tested in the numerical experiments of this paper are practical datasets of pine orchards available at the Dryad Digital Repository222http://dx.doi.org/10.5061/dryad.9pn5m and datasets generated by a simulation software package [25].
Tables 2 and 3 present the comparison of the three conic relaxation approaches. The number of chosen genotype is set 50 and 150 in Table 2 and Table 3, respectively. The first column in the tables shows the name of algorithms; for example, CR (LP) is the result of conic relaxation problem (in this case, an LP problem) and SA (LP) is the result after the application of Algorithm 4.2 starting from the solution of CR (LP). The names for SOCP and SDP are indicated with the same rule. For CR (LP)-s and SA (LP)-s, we applied Remark 2.4 to (119) and obtain its solution by sorting \mbox{\boldmathg}_{V}. The second column is , the number of genotype candidates, while the third column is . The fourth and fifth columns are the objective value \mbox{\boldmathg}^{T}\mbox{\boldmathx} and the value \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx} of each algorithm. For the CR rows, these two values were evaluated at \mbox{\boldmathx}^{*}, the solution at Step 1 of Algorithm 4.2, and for the SA rows, they were computed with the output solution \mbox{\boldmathx}^{h} at Step 4. The sixth column is the iteration number of Algorithm 4.2. The seventh column is the value of f_{\lambda}(\mbox{\boldmathx}^{h}), where is the iteration number indicated in the sixth column. For the CR row, note that \mbox{\boldmathx}^{*} must satisfy the quadratic constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta, but \mbox{\boldmathx}^{0} does not always satisfy it. In contrast, for the SA rows, f_{\lambda}(\mbox{\boldmathx}^{h}) is given at the final solution of Step 4. The last column is the computation time in seconds. Since SA (LP) uses the result of CR (LP), the computation time of SA (LP) is the sum of the computation time of CR (LP) and the steep-ascent method. In a similar way, SA (SOCP) and SA (SDP) are also the sums.
For the smallest case and , the SDP relaxation problem attains a remarkable result. Since the solution in SA (SDP) satisfies the constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta, this solution is a feasible solution of the original ED problem (11). In addition, it holds that from Lemma 2.3. Therefore, we know and we obtain the optimal value of the ED problem up to an error . Since this error is much better than the theoretical bounds discussed in Lemma 3.2, the combination of the SDP relaxation and the steepest-ascent method performs very well in this case.
In addition, we can make sure that the objective values of CR (LP) and CR (LP)-s are same, and this indicates that Assumption 2.2 holds in the numerical tests. As a result, we can obtain the solution of the LP relaxation problem (119) without solving it as an LP problem, as noted in Remark 2.4.
From Tables 2 and 3, we observe in the cases that from the values of \mbox{\boldmathg}^{T}\mbox{\boldmathx} in the CR (LP), CR (SOCP), and CR (SDP) rows. This result supports the validity of Lemma 2.3. However, we also observe for large instances that \mbox{\boldmathg}^{T}\mbox{\boldmathx} of CR (SDP) is lower than that of CR (SOCP). A principal reason of this inconsistent phenomenon is a premature termination of SDPT-3. These inaccurate values of CR (SDP) were mainly caused by the lack of interior-feasible points in (85); see Remark 2.5. The SOCP relaxation problem (135) provides highly numerical stability compared to the SDP relaxation problem (85). This can be regarded as an advantage of the SOCP relaxation approach.
As a next viewpoints, the violations of the solution generated by the steep-ascent method against \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta are remarkably small. This is mainly because we set large enough based on the Lagrangian multiplier . Therefore, the maximization of the function with the penalty term (174) can provide a suitable solution for the ED problem (11).
A comparison of the results of the steepest-ascent methods that starts from the three conic relaxation indicates that if SDPT-3 obtained sensible solutions (), the output objective values \mbox{\boldmathg}^{T}\mbox{\boldmathx} of SA (SDP) and SA (SOCP) were close to each other, but much higher than that of SA (LP). This implies that the SOCP relaxation problem and the SDP relaxation problem provided good starting points for the steepest-ascent method. This point is also indicated in the iteration number of the steepest-ascent methods. The iteration numbers of SA (SDP) and SA (SOCP) are much less than that of SA (LP). For example, when and , SA (SDP) and SA (SOCP) required only two and three iterations, respectively, while SA (LP) required 65 iterations. Therefore, we can infer that the solutions of the SDP relaxation problem and the SOCP relaxation problem are close to local maximizer of f_{\lambda}(\mbox{\boldmathx}).
When we move our focus from the solution quality to the computation time, the computation of SA (SOCP) is much shorter than SA (SDP). This was mainly because we can aggressively exploit the structure of the Wright numerator matrix through the SOCP approach discussed in [47]. Since the objective values \mbox{\boldmathg}^{T}\mbox{\boldmathx} of SA (SOCP) and SA (SDP) are competitive, SA (SOCP) can be considered as the most efficient approach among the three SA (LP), SA (SOCP), and SA (SDP).
Judging from these results, we chose the SOCP relaxation problem for generating the starting point when we compare Algorithm 4.2 against the existing approaches, GENCONT, OPSEL, and CPLEX. Since OPSEL and CPLEX utilized the brand-and-bound framework, we can expect their solution are close to the real optimal solution of the ED problem (11).
Tables 4 and 5 present the comparison of Algorithm 4.2, GENCONT and OPSEL (Version 2.0), CPLEX (Version 12.62) for the numbers of selected genotypes and . We tried to execute GENCONT2, a new version of GENCONT, but its binary file did not work on our computational environment. Therefore, we used GENCONT1 for the comparison. In the tables, we evaluate f_{\lambda}(\mbox{\boldmathx}) in seventh column at the solution obtained from each algorithm and the eighth column reports the number of chosen genotypes . For OPSEL and CPLEX, we set the time limit as three hours and the tolerance gap as . When OPSEL and CPLEX reached the time limit, we obtained a feasible solution from OPSEL, but we could not extract sensible solutions from CPLEX. GENCONT could not solve large instances ( and ) due to out of memory.
From the numerical results in Tables 4 and 5, the objective values of SA (SOCP) are close to those of GENCONT, OPSEL, and CPLEX. Since the cost vector in the objective function is usually generated from a statistical procedure, the discrepancy in the objective values make a little difference for practical use. However, GENCONT sometimes failed to satisfy the constraints; the quadratic constraint \mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx}\leq 2\theta was violated in the or cases, and the number of the chosen genotypes did not match the input . Therefore, the quality of SA (SOCP) was superior to that of GENCONT.
From the viewpoint of the computation time, of SA (SOCP) is much faster than GENCONT. In particular, for the case , SA (SOCP) used less than one seconds, while GENCONT required 1700 seconds. The computation times of the branch-and-bound framework were unpredictable. In , OPSEL and CPLEX required longer computation for a small problem than for a large problem . It is very difficult to estimate the computation time required by OPSEL and CPLEX in advance due to a nature of the branch-and-bound framework. In contrast, SA (SOCP) consumed longer computation time for larger problems and this property is favorable for practical use.
6 Conclusion and Future Directions
In this paper, we introduced the conic relaxation approach based on LP, SOCP, and SDP for the special-case ED selection problem that is commonly encountered in tree breeding. We discussed the strength of the three conic relaxation problems, and gave the theoretical bounds of the randomized algorithm that uses the SDP relaxation problem. The fact that the theoretical bounds are not so sharp motivated us to implement the steep-ascent method so that we can acquire a suitable solution for practical usage. From the numerical results, we found that the steep-ascent method with the SOCP relaxation was the most effective among the three conic relaxation approaches, and that this outperformed the existing methods, in particular, from the viewpoint of computation time.
One of further directions is to find a better theoretical bounds for the conic relaxation problems. In the discussions of this paper, we mainly relied on the positive semidefiniteness of and non-negativity of the Wright numerator matrix . Since the specific values of the elements in this matrix strongly relate to the pedigree of the candidate pool, there is a possibility that we exploit such structures to tighten the theoretical bounds discussed in Lemma 3.2. However, we also need to reduce the computation time of the SDP relaxation problem to make the SDP approach effective.
Another research direction is to minimize inbreeding depression [18]. The objective function there is of form (1-(\mbox{ID})\mbox{\boldmathx}^{T}\mbox{\boldmathA}\mbox{\boldmathx})\mbox{\boldmathg}^{T}\mbox{\boldmathx}, where ID is a constant that represents a regression slope. Since the function is a cubic function with respect to the contribution , this function is not even a convex function. From the numerical results in this paper, we expect that a similar method to the steep-ascent method may perform well for solving such a problem. The minimization of the inbreeding depression will be an interesting problem to researchers in the optimization field.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] J. Ahlinder, T. Mullin, and M. Yamashita. Using semidefinite programming to optimize unequal deployment of genotypes to a clonal seed orchard. Tree genetics & genomes , 10(1):27–34, 2014.
- 2[2] F. Alizadeh and D. Goldfarb. Second-order cone programming. Mathematical programming , 95(1):3–51, 2003.
- 3[3] A. Ben-Tal and A. Nemirovski. On polyhedral approximations of the second-order cone. Mathematics of Operations Research , 26(2):193–205, 2001.
- 4[4] H. Y. Benson and U. Saglam. Mixed-integer second-order cone programming: A survey. Tutorials in Operations Research , pages 13–36, 2013.
- 5[5] L. Bomosssoul and D. Lindgren. Optimal utilization of clones and genetic thinning of ‘seed orchards. Silvae Genetica , 42:4–5, 1993.
- 6[6] D. Charlesworth and B. Charlesworth. Inbreeding depression and its evolutionary consequences. Annual review of ecology and systematics , 18(1):237–268, 1987.
- 7[7] A. Domahidi, E. Chu, and S. Boyd. ECOS: An SOCP solver for embedded systems. In Proceedings of European Control Conference , pages 3071–3076, 2013.
- 8[8] S. Drewes and S. Ulbrich. Subgradient based outer approximation for mixed integer second order cone programming. In Mixed Integer Nonlinear Programming , pages 41–59. Springer, 2012.
