An efficient global optimization algorithm for maximizing the sum of two generalized Rayleigh quotients

Xiaohui Wang; Longfei Wang; Yong Xia

arXiv:1706.00596·math.OC·January 8, 2018


TL;DR

This paper introduces a branch-and-bound algorithm for globally maximizing the sum of two generalized Rayleigh quotients, reformulating the problem into a one-dimensional optimization with SDP subproblems, and demonstrating superior efficiency over existing heuristics.

Contribution

The paper presents a novel branch-and-bound method that explicitly overestimates the objective using dual SDP subproblems for efficient global optimization.

Findings

1. The proposed algorithm outperforms recent SDP-based heuristics in efficiency.
2. Reformulation reduces the problem to a one-dimensional optimization with SDP subproblems.
3. Numerical results confirm the effectiveness of the method.

Abstract

Maximizing the sum of two generalized Rayleigh quotients (SRQ) can be reformulated as a one-dimensional optimization problem, where the function value evaluations are reduced to solving semi-definite programming (SDP) subproblems. In this paper, we first use the dual SDP subproblem to construct an explicit overestimation and then propose a branch-and-bound algorithm to globally solve (SRQ). Numerical results demonstrate that it is even more efficient than the recent SDP-based heuristic algorithm.

Tables

Table 1: Average numerical results over ten runs of solving (P) for different dimensions $n$.

n	“two-stage” algorithm NRX: time (s)	“two-stage” algorithm NRX: iter.	Our new algorithm: time (s)	Our new algorithm: iter.
30	58.84	233.6	11.93	50.1
50	98.19	320.8	16.80	58.6
80	192.09	400.9	31.59	68.7
100	299.23	459.3	44.08	71.4
120	493.83	536.9	62.52	71.3
150	915.29	609.4	108.95	75.8
180	1519.09	634.0	186.84	81.2
200	2118.18	672.2	262.78	86.6

Equations

$${\rm(SRQ)}\quad \max_{x\neq 0}\ \frac{x^{T}Bx}{x^{T}Wx}+\frac{x^{T}Dx}{x^{T}Vx}$$

$${\rm(P)}\quad \max_{x\in\mathbb{R}^{n}}\ f(x)=\frac{x^{T}Bx}{x^{T}Wx}+x^{T}Dx\quad {\rm s.t.}\ \|x\|=1$$

$$({\rm P}_{1})\quad \max_{\mu\in[\underline{\mu},\bar{\mu}]}\ q(\mu):=\mu+g(\mu)$$

$$g(\mu)=\max_{x\in\mathbb{R}^{n}}\left\{x^{T}Dx:\ \|x\|=1,\ x^{T}(B-\mu W)x\geq 0\right\}$$

$$\underline{\mu}=\min_{\|x\|=1}\frac{x^{T}Bx}{x^{T}Wx},\qquad \bar{\mu}=\max_{\|x\|=1}\frac{x^{T}Bx}{x^{T}Wx}$$

$$({\rm SDP}_{\mu})$$

$$({\rm SD}_{\mu})$$

$$\mu<\bar{\mu}$$

$$\|x\|=1,\quad (B-\mu^{*}W)x=0$$

$$g(\mu^{*})=\max_{\|x\|=1}x^{T}\left(D-\eta^{*}(B-\mu^{*}W)\right)x=\lambda_{\max}\left(D-\eta^{*}(B-\mu^{*}W)\right)$$

$$q(\mu_{i})=\mu_{i}+\nu_{i},\qquad q(\mu_{i+1})=\mu_{i+1}+\nu_{i+1}$$

$$q(\mu)\leq q(\mu_{i})+\mu-\mu_{i}+\eta_{i}(\mu_{i}-\mu)\lambda_{\min}(W):=q_{1}(\mu)$$

$$q(\mu)\leq q(\mu_{i+1})+\mu-\mu_{i+1}+\eta_{i+1}(\mu_{i+1}-\mu)\lambda_{\max}(W):=q_{2}(\mu)$$

$$\bar{q}(\mu)=\min\{q_{1}(\mu),q_{2}(\mu)\}$$

$$U_{i}=\max_{\mu\in[\mu_{i},\mu_{i+1}]}\bar{q}(\mu)$$

$$U_{i}=\begin{cases}q(\mu_{i}),&{\rm if}\ \eta_{i}\lambda_{\min}(W)\geq 1,\\ q(\mu_{i+1}),&{\rm if}\ \eta_{i+1}\lambda_{\max}(W)\leq 1,\\ q_{1}(\mu_{0}),&{\rm otherwise}\end{cases}$$

$$\mu_{0}=\frac{q(\mu_{i+1})-\mu_{i+1}+\eta_{i+1}\mu_{i+1}\lambda_{\max}(W)-q(\mu_{i})+\mu_{i}-\eta_{i}\mu_{i}\lambda_{\min}(W)}{\eta_{i+1}\lambda_{\max}(W)-\eta_{i}\lambda_{\min}(W)}$$

$$q(\mu)\leq q(\mu_{i})+\mu-\mu_{i}$$

$$\underline{\mu}=\mu_{1}<\dots<\mu_{k+1}=\bar{\mu}$$

$$UB_{1}=\max_{\mu\in[\mu_{1},\tilde{\mu}]}\bar{q}(\mu),\qquad UB_{2}=\max_{\mu\in[\tilde{\mu},\mu_{2}]}\bar{q}(\mu)$$

$$v({\rm P}_{1})\geq q(\mu^{*})\geq v({\rm P}_{1})-\epsilon$$

$$\bar{\mu}-\epsilon\leq\underline{\mu}$$

$$q(\mu)\leq q(\underline{\mu})+\mu-\underline{\mu}\leq q(\underline{\mu})+\bar{\mu}-\underline{\mu}\leq q(\underline{\mu})+\epsilon$$

$$q(\mu^{*})=q(\underline{\mu})\geq\max_{\mu\in[\underline{\mu},\bar{\mu}]}q(\mu)-\epsilon=v({\rm P}_{1})-\epsilon$$

$$UB\leq q(\mu_{1})+\mu_{2}-\mu_{1}$$

$$\mu_{2}-\mu_{1}\leq\epsilon$$

$$UB^{*}\leq q(\mu^{*})+\epsilon$$

$$q(\mu^{*})\geq v({\rm P}_{1})-\epsilon$$

$$q(\mu)\leq q(\hat{\mu})+\mu-\hat{\mu}\leq q(\hat{\mu})+\bar{\mu}-\hat{\mu}=q(\hat{\mu})+\epsilon$$

v (P_{1})


Taxonomy

Topics: Advanced Optimization Algorithms Research · Polynomial and algebraic computation · graph theory and CDMA systems

Full text

X. Wang: School of Astronautics, Beihang University, Beijing, 100191, P. R. China. Email: [email protected]

L.F. Wang, Y. Xia: State Key Laboratory of Software Development Environment, LMIB of the Ministry of Education, School of Mathematics and System Sciences, Beihang University, Beijing 100191, P. R. China. Email: [email protected] (L.F. Wang); [email protected] (Y. Xia)

This research was supported by National Natural Science Foundation of China under grants 11471325 and 11571029.

Abstract

Maximizing the sum of two generalized Rayleigh quotients (SRQ) can be reformulated as a one-dimensional optimization problem, where each function evaluation reduces to solving a semi-definite programming (SDP) subproblem. In this paper, we first use the optimal value of the dual SDP subproblem to construct a new saw-tooth-type overestimation. Then, we propose an efficient branch-and-bound algorithm to globally solve (SRQ), which is shown to find an $\epsilon$-approximation optimal solution of (SRQ) in at most $O\left(\frac{1}{\epsilon}\right)$ iterations. Numerical results demonstrate that it is even more efficient than the recent SDP-based heuristic algorithm.

Keywords: fractional programming, Rayleigh quotient, semidefinite programming, branch and bound.

MSC:

90C32 90C26 90C22

1 Introduction

The problem of maximizing the sum of two generalized Rayleigh quotients

$${\rm(SRQ)}\quad \max_{x\neq 0}\ \frac{x^{T}Bx}{x^{T}Wx}+\frac{x^{T}Dx}{x^{T}Vx} \tag{1}$$

with positive definite matrices $W$ and $V$, has recent applications in the multi-user MIMO system PG and the sparse Fisher discriminant analysis in pattern recognition DFB; FM; WZW. Without loss of generality, we can assume that $V$ is the identity matrix. Otherwise, we reformulate (1) as a problem in terms of $y$ by substituting $x=V^{-\frac{1}{2}}y$. Moreover, since the objective function in (1) is homogeneous of degree zero, (SRQ) can be further recast as the following sphere-constrained optimization problem, which was first proposed by Zhang Hong; HZ:

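$${\rm(P)}\quad \max_{x\in\mathbb{R}^{n}}\ f(x)=\frac{x^{T}Bx}{x^{T}Wx}+x^{T}Dx\quad {\rm s.t.}\ \|x\|=1,$$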

where $\|\cdot\|$ denotes the $\ell_{2}$ -norm throughout this paper.
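The reduction from (SRQ) to (P) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names (`srq`, `reduce_to_sphere`, `f_sphere`) and the random test data are illustrative and not part of the paper.

```python
import numpy as np

def srq(x, B, W, D, V):
    """Objective of (SRQ) at a nonzero vector x."""
    return (x @ B @ x) / (x @ W @ x) + (x @ D @ x) / (x @ V @ x)

def reduce_to_sphere(B, W, D, V):
    """Transform the data by M -> V^{-1/2} M V^{-1/2}, so that V becomes the identity."""
    vals, vecs = np.linalg.eigh(V)                    # V is symmetric positive definite
    V_inv_half = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return tuple(V_inv_half @ M @ V_inv_half for M in (B, W, D))

def f_sphere(y, B2, W2, D2):
    """Objective of (P) at a unit vector y."""
    return (y @ B2 @ y) / (y @ W2 @ y) + y @ D2 @ y

# Hypothetical random data, used only to illustrate the identity.
rng = np.random.default_rng(0)
n = 4
sym = lambda A: (A + A.T) / 2
B, D = sym(rng.normal(size=(n, n))), sym(rng.normal(size=(n, n)))
L1, L2 = rng.normal(size=(n, n)), rng.normal(size=(n, n))
W, V = L1 @ L1.T + np.eye(n), L2 @ L2.T + np.eye(n)

B2, W2, D2 = reduce_to_sphere(B, W, D, V)
x = rng.normal(size=n)
vals, vecs = np.linalg.eigh(V)
y = (vecs @ np.diag(vals ** 0.5) @ vecs.T) @ x        # y = V^{1/2} x, i.e. x = V^{-1/2} y
y_unit = y / np.linalg.norm(y)                        # homogeneity lets us normalize y
```

With these definitions, `srq(x, B, W, D, V)` and `f_sphere(y_unit, B2, W2, D2)` agree up to round-off, and scaling `x` leaves the (SRQ) objective unchanged.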

The single generalized Rayleigh quotient optimization problem (i.e., (SRQ) with $B=0$) is related to the classical eigenvalue problem and can be solved in polynomial time ZYL. However, to the best of our knowledge, whether the general (SRQ) (or (P)) can be solved in polynomial time remains open. Actually, as shown in [Hong, Example 1.1], (P) may have several local non-global maximizers. Moreover, even finding a critical point of (P) is nontrivial; see Hong; HZ.

Recently, (P) was reformulated as the problem of maximizing the following one-dimensional function NRX:

$$({\rm P}_{1})\quad \max_{\mu\in[\underline{\mu},\bar{\mu}]}\ q(\mu):=\mu+g(\mu), \tag{2}$$

where $g(\mu)$ is related to a non-convex quadratic optimization:

$$g(\mu)=\max_{x\in\mathbb{R}^{n}}\left\{x^{T}Dx:\ \|x\|=1,\ x^{T}(B-\mu W)x\geq 0\right\}, \tag{3}$$

and the lower and upper bounds

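$$\underline{\mu}=\min_{\|x\|=1}\frac{x^{T}Bx}{x^{T}Wx},\qquad \bar{\mu}=\max_{\|x\|=1}\frac{x^{T}Bx}{x^{T}Wx} \tag{4}$$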

are the smallest and the largest generalized eigenvalues of the matrix pencil $(B,W)$, respectively. In order to solve the one-dimensional problem (2), a “two-stage” heuristic algorithm was proposed in NRX: it first subdivides $[\underline{\mu},\bar{\mu}]$ into coarse intervals such that each one contains a local maximizer of $q(\mu)$ and then applies the quadratic fit line search An; Baz; Lu in each interval. For any given $\mu$, $g(\mu)$ (or $q(\mu)$) can be evaluated by solving an equivalent semi-definite programming (SDP) formulation, according to an extended version of the S-Lemma in [Poly, Proposition 4.1]; see also [D, Theorem 5.17]. Finally, for the returned optimal solution $\mu^{*}$, the optimal vector solution of (P) is recovered by a rank-one decomposition procedure [NRX, Theorem 3]. Though this “two-stage” algorithm found the global solutions of the tested examples, it is still a heuristic algorithm since the function $q(\mu)$ is not guaranteed to be quasi-concave. Besides, there is no meaningful stopping criterion for the “two-stage” algorithm. That is, we cannot estimate the gap between the obtained solution and the global maximizer of (P1).
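The endpoints of the search interval in (4) can be computed as extreme generalized eigenvalues of the pencil $(B,W)$. The following is a minimal sketch, assuming NumPy; the function name `mu_bounds` is illustrative, and the Cholesky-based reduction to a standard symmetric eigenproblem is one of several possible approaches, not necessarily the one used by the authors.

```python
import numpy as np

def mu_bounds(B, W):
    """Smallest and largest generalized eigenvalues of (B, W), with W positive definite."""
    L = np.linalg.cholesky(W)                  # W = L L^T
    L_inv = np.linalg.inv(L)
    M = L_inv @ B @ L_inv.T                    # shares its eigenvalues with the pencil (B, W)
    vals = np.linalg.eigvalsh((M + M.T) / 2)   # symmetrize against round-off; ascending order
    return vals[0], vals[-1]

# Data of Example 2 in Section 5, where the paper reports the interval [0.2, 4.5].
B = np.diag([1.0, 9.0, 2.0])
W = np.diag([5.0, 2.0, 3.0])
mu_lo, mu_hi = mu_bounds(B, W)
```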

In this paper, we propose an easy-to-evaluate function for upper bounding $q(\mu)$. It provides saw-tooth-curve upper bounds of $q(\mu)$ over $[\underline{\mu},\bar{\mu}]$, which are used to establish an efficient branch-and-bound algorithm. We further show that the new algorithm returns an $\epsilon$-approximation optimal solution of (P1) in at most $O\left(\frac{1}{\epsilon}\right)$ iterations. Numerical results show that the new algorithm is much more efficient than the “two-stage” heuristic algorithm NRX.

The remainder of this paper is organized as follows. In Section 2, we give some preliminaries on the evaluation of $g(\mu)$. In Section 3, we propose an easy-to-compute upper bounding function, which provides saw-tooth-curve upper bounds of $q(\mu)$. In Section 4, we establish a new branch-and-bound algorithm and estimate the worst-case computational complexity. In Section 5, we report numerical comparisons, which demonstrate the efficiency of our new algorithm. Conclusions are made in Section 6.

Throughout the paper, $v(\cdot)$ denotes the optimal objective value of the problem $(\cdot)$ . We use $A\succeq(\preceq)0$ to stand for a positive (negative) semi-definite matrix $A$ . The positive definite matrix $A$ is denoted by $A\succ 0$ . Let $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ be the maximal and minimal eigenvalue of $A$ , respectively. The inner product of two matrices $A$ and $B$ is denoted by $A\bullet B=$ trace $(AB^{T})$ . For a real number $a$ , $\lfloor a\rfloor$ returns the largest integer less than or equal to $a$ .

2 Preliminaries

In this section, we first show how to evaluate $g(\mu)$. Then, we present the “two-stage” algorithm NRX to maximize $q(\mu)$ in (2). Finally, we discuss how to get the optimal vector solution of (P) from the maximizer of $q(\mu)$.

Lifting $xx^{T}$ to $X\in\mathbb{R}^{n\times n}$ (since $x^{T}Ax=A\bullet(xx^{T})$) yields the primal SDP relaxation of the optimization problem of evaluating $g(\mu)$ for any given $\mu$:

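$$({\rm SDP}_{\mu})\quad \max\ D\bullet X\quad {\rm s.t.}\ I\bullet X=1,\ (B-\mu W)\bullet X\geq 0,\ X\succeq 0.$$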

The conic dual problem of $({\rm SDP}_{\mu})$ is

[TABLE]

which coincides with the Lagrangian dual problem of $g({\mu})$ .

It is trivial to see that $({\rm SD}_{\mu})$ has an interior feasible solution, i.e., Slater’s condition holds. We can verify that, for any $\mu$ satisfying

$$\mu<\bar{\mu}, \tag{5}$$

Slater’s condition holds for $({\rm SDP}_{\mu})$, i.e., there is an $X\succ 0$ such that $I\bullet X=1$ and $(B-\mu W)\bullet X>0$. Therefore, under the assumption (5), strong duality holds for $({\rm SDP}_{\mu})$, that is, $v({\rm SDP}_{\mu})=v({\rm SD}_{\mu})$ and both optimal values are attained.

Under the assumption (5), by further applying the extended version of S-Lemma in [Poly , Proposition 4.1, see also [D , Theorem 5.17]], we can show that the strong duality holds for the optimization problem of evaluating $g({\mu})$ , i.e., $g({\mu})=v({\rm SD}_{\mu})$ . For more details, we refer to NRX .
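For small instances, $g(\mu)$ can also be evaluated without an SDP solver by minimizing the one-dimensional Lagrangian dual function directly. The sketch below assumes NumPy and assumes the dual is written as $\min_{\eta\geq 0}\lambda_{\max}(D+\eta(B-\mu W))$, which is the Lagrangian dual of (3) with multiplier $\eta$ attached to the constraint $x^{T}(B-\mu W)x\geq 0$; the sign attached to $\eta$ in the paper's own $({\rm SD}_{\mu})$ may differ. The function name `g_dual` and the ternary search are illustrative choices, not the authors' implementation (they use SDPT3 within CVX).

```python
import numpy as np

def g_dual(mu, B, W, D, iters=200):
    """Approximate (nu, eta): the minimum over eta >= 0 of lambda_max(D + eta*(B - mu*W))."""
    C = B - mu * W
    phi = lambda eta: np.linalg.eigvalsh(D + eta * C)[-1]   # convex in eta
    hi = 1.0
    for _ in range(60):                  # enlarge the bracket until phi stops decreasing
        if phi(hi) >= phi(hi / 2):
            break
        hi *= 2
    lo = 0.0
    for _ in range(iters):               # ternary search on the convex function phi
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if phi(m1) <= phi(m2):
            hi = m2
        else:
            lo = m1
    eta = (lo + hi) / 2
    return phi(eta), eta

# Data of Example 2 in Section 5 (diagonal, so g can be checked by hand).
B = np.diag([1.0, 9.0, 2.0])
W = np.diag([5.0, 2.0, 3.0])
D = np.diag([5.0, 2.0, 3.0])
nu_1, eta_1 = g_dual(1.0, B, W, D)       # hand computation gives g(1) = 43/11 at eta = 3/11
nu_lo, _ = g_dual(0.2, B, W, D)          # at mu = 0.2 the constraint is inactive, so g = lambda_max(D) = 5
```

The bracket-growing loop relies on $\mu<\bar{\mu}$, under which the dual function eventually increases in $\eta$; at $\mu=\bar{\mu}$ it may decrease indefinitely and the loop simply stops at its cap.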

Next, we present the “two-stage” algorithm proposed in NRX for solving (2). In the first stage, it partitions $[\underline{\mu},\bar{\mu}]$ into a rather coarse mesh and then collects all subintervals containing an interior local maximizer. In the second stage, the quadratic fit method Baz; An; Lu is applied to find a corresponding local maximizer in each subinterval that has been collected in the first stage. Finally, the optimal solution $\mu^{*}$ is selected from all these obtained local maximizers. In this paper, we will not present the detailed quadratic fit line search subroutine, which can be found in NRX. One reason is that the first stage alone is already quite time-consuming.

The “two-stage” scheme proposed in NRX

Step 1.

Given $\delta>0.$ Let $\mu_{0}=\underline{\mu}$ and $\mu_{i}=\underline{\mu}+(i-1)\delta$ for $i=1,2,\ldots,\lfloor\frac{\bar{\mu}-\underline{\mu}}{\delta}\rfloor+1$ . If $\frac{\bar{\mu}-\underline{\mu}}{\delta}$ is not an integer, set $\mu_{k}=\bar{\mu}$ for $k=\lfloor\frac{\bar{\mu}-\underline{\mu}}{\delta}\rfloor+2$ .

Step 2.

For $i=1,2,\ldots,$ collect all the three-point patterns $[\mu_{i-1},\mu_{i},\mu_{i+1}]$ such that $\max\{q(\mu_{i-1}),q(\mu_{i+1})\}\leq q(\mu_{i})$.

Step 3.

Call the quadratic fit line search subroutine (with a smaller tolerance than $\delta$ ) to find a corresponding local maximizer in each three-point pattern $[\mu_{i-1},\mu_{i},\mu_{i+1}]$ .

Step 4.

Select the best maximizer $\mu^{*}$ among $\underline{\mu}$ , $\bar{\mu}$ , and all the local maximizers found in Step 3.
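Steps 1 and 2 of the scheme above can be sketched as follows. This is a minimal illustration in plain Python with simplified zero-based indexing; the function names and the test function $q$ are hypothetical, and the quadratic fit subroutine of Step 3 is not included.

```python
import math

def coarse_grid(mu_lo, mu_hi, delta):
    """Uniform mesh of step delta starting at mu_lo, with mu_hi appended if it is not a mesh point."""
    n = math.floor((mu_hi - mu_lo) / delta)
    grid = [mu_lo + j * delta for j in range(n + 1)]
    if grid[-1] < mu_hi:
        grid.append(mu_hi)
    return grid

def three_point_patterns(q, grid):
    """Collect triples whose middle point is at least as good as both neighbours."""
    vals = [q(mu) for mu in grid]             # one function evaluation per mesh point
    return [
        (grid[i - 1], grid[i], grid[i + 1])
        for i in range(1, len(grid) - 1)
        if max(vals[i - 1], vals[i + 1]) <= vals[i]
    ]

# Hypothetical one-dimensional function with a single interior maximizer at mu = 1.
q = lambda mu: -(mu - 1.0) ** 2
grid = coarse_grid(0.0, 2.0, 0.5)
patterns = three_point_patterns(q, grid)
```

In the actual scheme each call to `q` requires solving one SDP subproblem, which is why the mesh size $\delta$ dominates the cost of the first stage.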

Suppose (2) has been solved and let $\mu^{*}$ be the returned maximizer. If $\mu^{*}=\bar{\mu}$, the feasible region of (3) is reduced to

$$\|x\|=1,\quad (B-\mu^{*}W)x=0,$$

which contains only the unit eigenvector corresponding to the maximal eigenvalue. In this case, $g(\mu^{*})$ is actually a maximum eigenvalue problem. On the other hand, suppose $\mu^{*}<\bar{\mu}$ , the optimal vector solution of (P) is recovered from the equivalent $({\rm SDP}_{\mu^{*}})$ based on the rank one constraint, by using a rank-one procedure similar to that in SZ ; Y , see details in NRX .
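For the boundary case $\mu^{*}=\bar{\mu}$, the recovery step amounts to computing a generalized eigenvector. The following is a minimal sketch, assuming NumPy and assuming that the largest generalized eigenvalue of $(B,W)$ is simple; the function name is illustrative.

```python
import numpy as np

def top_generalized_eigvec(B, W):
    """Return (mu_bar, x) with ||x|| = 1 and (B - mu_bar * W) x = 0."""
    L = np.linalg.cholesky(W)                  # W = L L^T
    L_inv = np.linalg.inv(L)
    M = L_inv @ B @ L_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2)
    x = L_inv.T @ vecs[:, -1]                  # map the eigenvector z of M back via x = L^{-T} z
    return vals[-1], x / np.linalg.norm(x)

# Data of Example 2 in Section 5, whose maximizer is reported at mu_bar = 4.5.
B = np.diag([1.0, 9.0, 2.0])
W = np.diag([5.0, 2.0, 3.0])
D = np.diag([5.0, 2.0, 3.0])
mu_bar, x = top_generalized_eigvec(B, W)
f_val = (x @ B @ x) / (x @ W @ x) + x @ D @ x  # objective of (P) at the recovered vector
```

For this diagonal instance the recovered vector is the second coordinate vector (up to sign), and the objective value is $4.5+2=6.5$.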

There is an alternative approach to recover the optimal solution of (P). Let $(\nu^{*},\eta^{*})$ be the optimal solution of the dual problem $({\rm SD}_{\mu^{*}})$ . It is not difficult to verify that

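$$g(\mu^{*})=\max_{\|x\|=1}x^{T}\left(D-\eta^{*}(B-\mu^{*}W)\right)x=\lambda_{\max}\left(D-\eta^{*}(B-\mu^{*}W)\right).$$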

Consequently, the optimal vector solution of (P) is the unit eigenvector corresponding to the maximum eigenvalue of $D-\eta^{*}(B-\mu^{*}W)$ .

3 Saw-tooth upper bounds

In this section, we propose an easy-to-evaluate upper bounding function, which provides saw-tooth upper bounds for $q(\mu)$ over $[\underline{\mu},\bar{\mu}]$ .

Let $\cup_{i=1}^{k}[\mu_{i},\mu_{i+1}]$ be a partition of $[\underline{\mu},\bar{\mu}]$ , where $\mu_{1}=\underline{\mu}$ and $\mu_{k+1}=\bar{\mu}$ .

Consider the interval $[\mu_{i},\mu_{i+1}]$ with $i\leq k-1$ (so that $\mu_{i+1}<\bar{\mu}$ ). Solve $({\rm SD}_{\mu})$ with $\mu=\mu_{i},\mu_{i+1}$ and denote the optimal solutions by $(\nu_{i},\eta_{i})$ and $(\nu_{i+1},\eta_{i+1})$ , respectively. Then, we have $\eta_{i}\geq 0$ , $\eta_{i+1}\geq 0$ , and

$$q(\mu_{i})=\mu_{i}+\nu_{i},\qquad q(\mu_{i+1})=\mu_{i+1}+\nu_{i+1}.$$

For any $\mu\in[\mu_{i},\mu_{i+1}]$ , it follows from the strong duality that

$$q(\mu)\leq q(\mu_{i})+\mu-\mu_{i}+\eta_{i}(\mu_{i}-\mu)\lambda_{\min}(W):=q_{1}(\mu). \tag{7}$$

Similarly, we have

$$q(\mu)\leq q(\mu_{i+1})+\mu-\mu_{i+1}+\eta_{i+1}(\mu_{i+1}-\mu)\lambda_{\max}(W):=q_{2}(\mu).$$

Now, we obtain an upper bounding function of $q(\mu)$ over $[\mu_{i},\mu_{i+1}]$ :

$$\bar{q}(\mu)=\min\{q_{1}(\mu),q_{2}(\mu)\},$$

which is a concave function as $q_{1}(\mu)$ and $q_{2}(\mu)$ are both linear functions. It provides the following upper bound of $q(\mu)$ over $[\mu_{i},\mu_{i+1}]$ :

$$U_{i}=\max_{\mu\in[\mu_{i},\mu_{i+1}]}\bar{q}(\mu). \tag{10}$$

Problem (10) is a convex program. Moreover, it has a closed-form solution.

Theorem 1

Under the assumption $\mu_{i+1}<\bar{\mu}$ , an upper bound of $q(\mu)$ over $[\mu_{i},\mu_{i+1}]$ is given by

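$$U_{i}=\begin{cases}q(\mu_{i}),&{\rm if}\ \eta_{i}\lambda_{\min}(W)\geq 1,\\ q(\mu_{i+1}),&{\rm if}\ \eta_{i+1}\lambda_{\max}(W)\leq 1,\\ q_{1}(\mu_{0}),&{\rm otherwise,}\end{cases}$$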

where

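$$\mu_{0}=\frac{q(\mu_{i+1})-\mu_{i+1}+\eta_{i+1}\mu_{i+1}\lambda_{\max}(W)-q(\mu_{i})+\mu_{i}-\eta_{i}\mu_{i}\lambda_{\min}(W)}{\eta_{i+1}\lambda_{\max}(W)-\eta_{i}\lambda_{\min}(W)}.$$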

Proof

The trivial proof is omitted as both $q_{1}(\mu)$ and $q_{2}(\mu)$ are linear functions and $\mu_{0}$ is the unique solution of the equation $q_{1}(\mu)=q_{2}(\mu)$ .
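The closed form in Theorem 1 can be sketched in a few lines of plain Python. The function name and argument order are illustrative; the inputs are the two endpoint values $q(\mu_{i})$, $q(\mu_{i+1})$, the dual multipliers $\eta_{i}$, $\eta_{i+1}$, and the extreme eigenvalues of $W$.

```python
def interval_upper_bound(mu_a, q_a, eta_a, mu_b, q_b, eta_b, lam_min, lam_max):
    """Return (U, mu_arg): the maximum of min(q1, q2) over [mu_a, mu_b] and where it is attained.

    q1 has slope 1 - eta_a * lam_min and passes through (mu_a, q_a);
    q2 has slope 1 - eta_b * lam_max and passes through (mu_b, q_b).
    """
    if eta_a * lam_min >= 1:            # q1 is non-increasing: the bound sits at the left endpoint
        return q_a, mu_a
    if eta_b * lam_max <= 1:            # q2 is non-decreasing: the bound sits at the right endpoint
        return q_b, mu_b
    # Otherwise q1 increases and q2 decreases, and they cross at mu_0.
    num = (q_b - mu_b + eta_b * mu_b * lam_max) - (q_a - mu_a + eta_a * mu_a * lam_min)
    mu_0 = num / (eta_b * lam_max - eta_a * lam_min)
    return q_a + (mu_0 - mu_a) * (1 - eta_a * lam_min), mu_0

# Hypothetical numbers: q1(mu) = 1 + mu and q2(mu) = 3 - 2*mu on [0, 1], crossing at mu = 2/3.
U, mu_0 = interval_upper_bound(0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 3.0)
```

In the third branch the denominator is positive because $\eta_{i+1}\lambda_{\max}(W)>1>\eta_{i}\lambda_{\min}(W)$, so the division is safe.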

Finally, we also have a simple estimation of the upper bound $U_{i}$ .

Theorem 2

For any $\mu\geq\mu_{i}$ , we have

$$q(\mu)\leq q(\mu_{i})+\mu-\mu_{i}. \tag{15}$$

Proof

The inequality (15) follows from the definition $q_{1}(\mu)$ (7) and the facts that $\eta_{i}\geq 0$ and $\lambda_{\min}(W)>0$ (as $W\succ 0$ ).

Remark 1

The estimation (15) is independent of $\mu_{i+1}$. Therefore, it also holds in the extended case $\mu_{i+1}=\bar{\mu}$.

4 A saw-tooth branch-and-bound algorithm

In this section, we first propose a branch-and-bound algorithm based on the new saw-tooth-curve upper bounds and then establish the worst-case computational complexity of the new algorithm.

Our algorithm works on a list

$$\underline{\mu}=\mu_{1}<\dots<\mu_{k+1}=\bar{\mu}. \tag{16}$$

The initial list is $\underline{\mu}=\mu_{1}<\mu_{2}=\bar{\mu}$ . In each iteration, we first select the interval $[\mu_{i},\mu_{i+1}]$ from the $\{\mu\}$ -list that provides the maximal upper bound $U_{i}$ (10). Then, we insert the mid-point $\frac{\mu_{i}+\mu_{i+1}}{2}$ into the $\{\mu\}$ -list (16) and increase $k$ by one. The process is repeated until the stopping criterion is reached. The detailed algorithm is presented as follows.

The saw-tooth branch-and-bound algorithm

Step 0.

Given the approximation error $\epsilon>0$ . Compute $\underline{\mu}$ , $\bar{\mu}$ (4), $\lambda_{\min}(W)$ and $\lambda_{\max}(W)$ . Initialize the iteration number $k=1$ .

Let $\mu_{1}=\underline{\mu}$ . Solve ( ${\rm SD}_{\mu_{1}}$ ) to obtain the optimal solution $(\nu_{1},\eta_{1})$ . Then, $q(\mu_{1})=\mu_{1}+\nu_{1}$ and let $LB=q(\mu_{1})$ , $\mu^{*}=\mu_{1}$ .

Let $\mu_{2}=\bar{\mu}-\epsilon$ . If $\mu_{2}\leq\underline{\mu}$ , stop and return $\mu^{*}$ as an approximate maximizer. Otherwise, solve ( ${\rm SD}_{\mu_{2}}$ ) to obtain the optimal solution $(\nu_{2},\eta_{2})$ . Then, $q(\mu_{2})=\mu_{2}+\nu_{2}$ . If $q(\mu_{2})>LB$ , update $LB=q(\mu_{2})$ and $\mu^{*}=\mu_{2}$ . Set $k=2$ and $S=\emptyset$ .

Step 1.

Let $\tilde{\mu}=\frac{1}{2}(\mu_{1}+\mu_{2})$ . Solve ( ${\rm SD}_{\tilde{\mu}}$ ) and obtain the optimal solution $(\tilde{\nu},\tilde{\eta})$ . Then, $q(\tilde{\mu})=\tilde{\mu}+\tilde{\nu}$ . If $q(\tilde{\mu})>LB$ , update $LB=q(\tilde{\mu})$ and $\mu^{*}=\tilde{\mu}$ .

Step 2.

According to Theorem 1, compute the upper bounds:

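$$UB_{1}=\max_{\mu\in[\mu_{1},\tilde{\mu}]}\bar{q}(\mu),\qquad UB_{2}=\max_{\mu\in[\tilde{\mu},\mu_{2}]}\bar{q}(\mu).$$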

Update $S=S\cup\{(UB_{1},\mu_{1},\tilde{\mu})\}\cup\{(UB_{2},\tilde{\mu},\mu_{2})\}$ and $k=k+1$ .

Step 3.

Find $(UB^{*},\mu_{1},\mu_{2})=\arg\max\limits_{(t,*,*)\in S}t$ . If $UB^{*}\leq LB+\epsilon$ , stop and return $\mu^{*}$ as an approximate maximizer. Otherwise, update $S=S\setminus\{(UB^{*},\mu_{1},\mu_{2})\}$ and go to Step 1.
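Steps 0 to 3 can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' implementation: the dual oracle `dual_diag` only handles diagonal $B$, $W$, $D$ and assumes the dual has the form $\min_{\eta\geq 0}\max_{i}\{d_{i}+\eta(b_{i}-\mu w_{i})\}$ (the paper solves $({\rm SD}_{\mu})$ with SDPT3 within CVX, and its sign convention for $\eta$ may differ); all function names are hypothetical.

```python
import heapq

def dual_diag(b, w, d, mu, iters=200):
    """(nu, eta) for diagonal data: minimise max_i d_i + eta*(b_i - mu*w_i) over eta >= 0."""
    c = [bi - mu * wi for bi, wi in zip(b, w)]
    phi = lambda eta: max(di + eta * ci for di, ci in zip(d, c))
    hi = 1.0
    for _ in range(60):                       # enlarge the bracket until phi stops decreasing
        if phi(hi) >= phi(hi / 2):
            break
        hi *= 2
    lo = 0.0
    for _ in range(iters):                    # ternary search on the convex function phi
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if phi(m1) <= phi(m2):
            hi = m2
        else:
            lo = m1
    eta = (lo + hi) / 2
    return phi(eta), eta

def interval_bound(a, b, lam_min, lam_max):
    """Theorem 1 bound on [a.mu, b.mu]; a and b are (mu, q, eta) triples."""
    (mu_a, q_a, eta_a), (mu_b, q_b, eta_b) = a, b
    if eta_a * lam_min >= 1:
        return q_a
    if eta_b * lam_max <= 1:
        return q_b
    num = (q_b - mu_b + eta_b * mu_b * lam_max) - (q_a - mu_a + eta_a * mu_a * lam_min)
    mu_0 = num / (eta_b * lam_max - eta_a * lam_min)
    return q_a + (mu_0 - mu_a) * (1 - eta_a * lam_min)

def sawtooth_bb(oracle, mu_lo, mu_hi, lam_min, lam_max, eps, max_iter=10000):
    """Return (mu_star, q(mu_star), iterations) following Steps 0-3."""
    def point(mu):
        nu, eta = oracle(mu)
        return (mu, mu + nu, eta)             # q(mu) = mu + nu
    left = best = point(mu_lo)                # Step 0
    if mu_hi - eps <= mu_lo:
        return best[0], best[1], 0
    right = point(mu_hi - eps)
    if right[1] > best[1]:
        best = right
    heap, it = [], 0                          # max-heap on the bound via negated keys
    for it in range(1, max_iter + 1):
        mid = point((left[0] + right[0]) / 2)                 # Step 1: bisect
        if mid[1] > best[1]:
            best = mid
        for a, b in ((left, mid), (mid, right)):              # Step 2: bound both halves
            heapq.heappush(heap, (-interval_bound(a, b, lam_min, lam_max), a, b))
        neg_ub, left, right = heapq.heappop(heap)             # Step 3: largest upper bound
        if -neg_ub <= best[1] + eps:
            break
    return best[0], best[1], it

# Data of Example 2 in Section 5 (diagonal); a looser tolerance than the paper's 1e-5 is used here.
b, w, d = [1, 9, 2], [5, 2, 3], [5, 2, 3]
mu_star, q_star, iters = sawtooth_bb(lambda mu: dual_diag(b, w, d, mu), 0.2, 4.5, min(w), max(w), eps=1e-3)
```

A heap replaces the explicit $\arg\max$ over $S$ in Step 3; the iteration count of this sketch need not match the counts reported in Section 5, since the oracle and tolerance differ.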

Theoretically, we can show that our new algorithm returns an $\epsilon$ -approximation optimal solution of (P1) in at most $O(\frac{1}{\epsilon})$ iterations. Here, we call $\mu^{*}$ an $\epsilon$ -approximation optimal solution of (P1) if it is feasible and satisfies

$$v({\rm P}_{1})\geq q(\mu^{*})\geq v({\rm P}_{1})-\epsilon.$$

Theorem 3

The above algorithm terminates in at most $\left\lceil\frac{\bar{\mu}-\underline{\mu}}{\epsilon}\right\rceil$ steps and returns an $\epsilon$ -approximation optimal solution of (P1).

Proof

If the algorithm terminates at Step 0, that is,

$$\bar{\mu}-\epsilon\leq\underline{\mu},$$

then for any $\mu\in[\underline{\mu},\bar{\mu}]$ , it follows from the inequality (15) in Theorem 2 that

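$$q(\mu)\leq q(\underline{\mu})+\mu-\underline{\mu}\leq q(\underline{\mu})+\bar{\mu}-\underline{\mu}\leq q(\underline{\mu})+\epsilon.$$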

Therefore, we have

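$$q(\mu^{*})=q(\underline{\mu})\geq\max_{\mu\in[\underline{\mu},\bar{\mu}]}q(\mu)-\epsilon=v({\rm P}_{1})-\epsilon.$$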

It follows that $\mu^{*}=\underline{\mu}$ is an $\epsilon$ -approximation optimal solution of (P1).

Now, we suppose that the algorithm does not terminate at Step 0. Consider $(UB,\mu_{1},\mu_{2})\in S$ in the $k$-th iteration of the algorithm. If $UB<UB^{*}$, then the interval $[\mu_{1},\mu_{2}]$ will not be selected for partitioning. In the following, we assume $UB=UB^{*}$. According to the inequality (15) in Theorem 2, for any $\mu\in[\mu_{1},\mu_{2}]$, we have

$$UB\leq q(\mu_{1})+\mu_{2}-\mu_{1}.$$

Since $UB=UB^{*}$ and $q(\mu_{1})\leq LB$ , according to the stopping criterion, the algorithm terminates when

$$\mu_{2}-\mu_{1}\leq\epsilon.$$

Therefore, there are at most $\left\lceil\frac{\bar{\mu}-\underline{\mu}}{\epsilon}\right\rceil$ elements in $S$. Since the number of elements of $S$ increases by one in each iteration, the algorithm stops in at most $\left\lceil\frac{\bar{\mu}-\underline{\mu}}{\epsilon}\right\rceil$ steps.

Let $\mu^{*}$ be the approximate solution returned by the algorithm. We have

$$UB^{*}\leq q(\mu^{*})+\epsilon.$$

To show that $\mu^{*}$ is an $\epsilon$ -approximation optimal solution of (P1), it is sufficient to prove that

$$q(\mu^{*})\geq v({\rm P}_{1})-\epsilon.$$

Let $\hat{\mu}=\bar{\mu}-\epsilon>\underline{\mu}$ . According to the inequality (15) in Theorem 2, for any $\mu\in[\hat{\mu},\bar{\mu}]$ , we obtain

$$q(\mu)\leq q(\hat{\mu})+\mu-\hat{\mu}\leq q(\hat{\mu})+\bar{\mu}-\hat{\mu}=q(\hat{\mu})+\epsilon.$$

Therefore, we have

[TABLE]

where the equality (19) follows from (17). Then, we obtain (18). The proof is complete.
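To get a feel for the bound in Theorem 3, the worst-case step count $\left\lceil\frac{\bar{\mu}-\underline{\mu}}{\epsilon}\right\rceil$ can be evaluated for the intervals reported in Section 5. The sketch below uses exact rational arithmetic to avoid floating-point rounding at the ceiling; the function name is illustrative.

```python
import math
from fractions import Fraction

def worst_case_steps(mu_lo, mu_hi, eps):
    """Ceiling of (mu_hi - mu_lo) / eps, computed exactly from decimal strings."""
    width = Fraction(mu_hi) - Fraction(mu_lo)
    return math.ceil(width / Fraction(eps))

# Intervals of Examples 2 and 5 with the tolerance eps = 1e-5 used in Section 5.
steps_ex2 = worst_case_steps("0.2", "4.5", "1e-5")
steps_ex5 = worst_case_steps("0", "100", "1e-5")
```

These worst-case values (430000 and 10000000) are far above the 2 and 22 iterations reported for the same examples, which suggests that the bound is pessimistic on these instances.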

5 Computational Experiments

We test the new branch-and-bound algorithm for solving (P1) on the same numerical examples as in NRX. The SDP subproblems $({\rm SD}_{\mu})$ are solved by SDPT3 within CVX Boyd. Since there is no unified stopping criterion in the “two-stage” heuristic algorithm NRX, we just report the number of function evaluations (i.e., the number of SDP subproblems solved) in the first stage, with the setting $\delta=0.05$ used in NRX. For our algorithm, we set $\epsilon=10^{-5}$.

The first example is taken from [Hong , Example 3.2]. It has many local non-global maximizers.

Example 1

Let $B=\left(\begin{matrix}2.3969&0.4651&4.6392\\ 0.4651&5.4401&0.7838\\ 4.6392&0.7838&10.1741\end{matrix}\right),\\ W=\left(\begin{matrix}0.8077&0.8163&1.0970\\ 0.8163&4.1942&0.8457\\ 1.0970&0.8457&1.8810\end{matrix}\right),D=\left(\begin{matrix}3.9104&-0.9011&-2.0128\\ -0.9011&0.9636&0.6102\\ -2.0128&0.6102&1.0908\end{matrix}\right).$

In this case, $[\underline{\mu},\bar{\mu}]=[0.9882,6.7322]$ . The “two-stage” algorithm NRX gives an approximation solution $\mu^{*}=6.5952.$ The number of function evaluations in the first stage is $116$ . Our algorithm returns an $\epsilon$ -approximation optimal solution, $\mu^{*}=6.5952$ , in $141$ iterations.

The second example in NRX is taken from [Hong, Example 3.1], where the optimal solution of $({\rm P}_{1})$ is achieved at the right-hand side end-point $\bar{\mu}$.

Example 2

$B={\rm diag}(1,9,2),W=D={\rm diag}(5,2,3).$

In this case, $[\underline{\mu},\bar{\mu}]=[0.2,4.5]$. The number of function evaluations in the first stage of the “two-stage” algorithm NRX is $87$. By contrast, our algorithm finds $\mu^{*}=4.5$ in $2$ iterations.

Example 3 (NRX , Example 3)

Let

[TABLE]

In this case, $[\underline{\mu},\bar{\mu}]=[-0.8241,6.0647].$ The “two-stage” algorithm NRX gives an approximation solution $\mu^{*}=5.8748.$ The number of function evaluations in the first stage is $139$ . Our algorithm returns an $\epsilon$ -approximation optimal solution, $\mu^{*}=5.8821$ , in $35$ iterations.

Example 4 (NRX , Example 4)

Let $n=10,B={\rm diag}(1,2,8,7,9,3,10,2,-1,6),\\ W={\rm diag}(9,8,7,6,5,4,3,2,1,10),D={\rm diag}(5,20,3,4,8,-1,0,6,32,10).$

The searching interval is $[\underline{\mu},\bar{\mu}]=[-1,3.3333].$ The optimal solution is the left-hand side end-point $-1$ . The number of function evaluations in the first stage is $88$ . Our algorithm returns an $\epsilon$ -approximation optimal solution, $\mu^{*}=-1$ , in $18$ iterations.

Example 5 (NRX , Example 5)

Let $n=20,$

$B={\rm diag}(1,2,20,3,50,4,6,7,8,9,100,2,3,4,5,6,7,0,10,9);$

$W={\rm diag}(100,1,2,30,5,7,9,7,8,9,1,2,30,1,50,8,1,10,10,9);$

$D={\rm diag}(0,1000,20,2,5,6,7,9,50,3,4,5,100,5,2,200,4,5,9,21).$

The searching interval of this example is $[\underline{\mu},\bar{\mu}]=[0,100].$ The “two-stage” algorithm NRX gives an approximation solution $\mu^{*}=2.0029.$ The number of function evaluations in the first stage is $2001$ . Our algorithm returns an $\epsilon$ -approximation optimal solution, $\mu^{*}=1.9999$ , in $22$ iterations.

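In Examples 2-5 reported above, our algorithm needs far fewer iterations than the number of function evaluations in the first stage of the “two-stage” algorithm NRX. For Example 1, our algorithm is also competitive ($141$ iterations versus $116$ first-stage function evaluations). Note that our algorithm comes with an $\epsilon$-optimality guarantee, whereas the “two-stage” algorithm NRX is heuristic.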
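For the diagonal instances (Examples 2, 4 and 5), both the search interval and the first-stage evaluation count of the “two-stage” scheme can be reproduced by hand. The sketch below does this in exact arithmetic; it assumes the count equals the number of mesh points in Step 1 of the scheme with $\delta=0.05$, and the function names are illustrative.

```python
import math
from fractions import Fraction

def diag_interval(b, w):
    """For diagonal B and W, the generalized eigenvalues are the ratios b_i / w_i."""
    ratios = [Fraction(bi, wi) for bi, wi in zip(b, w)]
    return min(ratios), max(ratios)

def first_stage_points(mu_lo, mu_hi, delta):
    """Mesh points in Step 1: floor(width/delta) + 1, plus one more if mu_hi is not on the mesh."""
    steps = (mu_hi - mu_lo) / delta
    count = math.floor(steps) + 1
    return count if steps == math.floor(steps) else count + 1

examples = {
    "Example 2": ([1, 9, 2], [5, 2, 3]),
    "Example 4": ([1, 2, 8, 7, 9, 3, 10, 2, -1, 6], [9, 8, 7, 6, 5, 4, 3, 2, 1, 10]),
    "Example 5": ([1, 2, 20, 3, 50, 4, 6, 7, 8, 9, 100, 2, 3, 4, 5, 6, 7, 0, 10, 9],
                  [100, 1, 2, 30, 5, 7, 9, 7, 8, 9, 1, 2, 30, 1, 50, 8, 1, 10, 10, 9]),
}
delta = Fraction(1, 20)                        # delta = 0.05 as in Section 5
results = {}
for name, (b, w) in examples.items():
    lo, hi = diag_interval(b, w)
    results[name] = (lo, hi, first_stage_points(lo, hi, delta))
```

The resulting intervals $[0.2,4.5]$, $[-1,10/3]$ and $[0,100]$ and the counts $87$, $88$ and $2001$ agree with the figures quoted for these examples.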

Finally, we test more examples where the data are chosen randomly as follows. Each component of the symmetric matrices $B$ and $D$ is uniformly distributed in $[-10,10]$. We generate $W$ and $V$ in the form $LL^{T}+\delta I$, where $L$ is a randomly generated lower bi-diagonal matrix with each nonzero element being uniformly distributed in $[-10,10]$ and $\delta>0$ is a constant chosen to guarantee the positive definiteness of $W$ and $V$. For each dimension varying from $30$ to $200$, we independently run the “two-stage” algorithm NRX and our new algorithm ten times and report in Table 1 the average numerical results, including the time in seconds and the number of iterations. These limited numerical results indicate that our new global optimization algorithm substantially outperforms the “two-stage” heuristic algorithm; for example, at $n=200$ the average time drops from $2118.18$ to $262.78$ seconds.

6 Conclusions

The recent SDP-based heuristic algorithm for maximizing the sum of two generalized Rayleigh quotients (SRQ) is based on the one-dimensional parametric reformulation where each function evaluation corresponds to solving a semi-definite programming (SDP) subproblem. In this paper, we propose an efficient branch-and-bound algorithm to globally solve (SRQ) based on the newly developed saw-tooth overestimating approach. It is shown to find an $\epsilon$-approximation optimal solution of (SRQ) in at most $O\left(\frac{1}{\epsilon}\right)$ iterations. Numerical results demonstrate that it is much more efficient than the recent SDP-based heuristic algorithm.

Bibliography

1. Antoniou, A., Lu, W.S.: Practical optimization: algorithms and engineering applications. Springer Science+Business Media, LLC (2007)
2. Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear programming: theory and algorithms, Third Edition. John Wiley and Sons, Inc., Hoboken, New Jersey (2006)
3. Dundar, M.M., Fung, G., Bi, J., Sandilya, S., Rao, B.: Sparse Fisher discriminant analysis for computer aided detection. Proceedings of SIAM International Conference on Data Mining (2005)
4. Fung, E., Michael, K. Ng.: On sparse Fisher discriminant method for microarray data analysis. Bioinformation 2, 230-234 (2007)
5. Grant, M., Boyd, S.: CVX: MATLAB software for disciplined convex programming, Version 2.1, http://cvxr.com/cvx (2015)
6. Luenberger, D.G., Ye, Y.: Linear and nonlinear programming, Third Edition. Springer Science+Business Media, LLC (2008)
7. Nguyen, V.B., Sheu, R.L., Xia, Y.: Maximizing the sum of a generalized Rayleigh quotient and another Rayleigh quotient on the unit sphere via semidefinite programming. J. Glob. Optim. 64(2), 399-416 (2016)
8. Pólik, I., Terlaky, T.: A survey of the S-lemma. SIAM Review 49(3), 371-418 (2007)
