On the generalized low rank approximation of the correlation matrices arising in the asset portfolio

Xuefeng Duan; Jianchao Bai; Maojun Zhang; Xinjun Zhang

arXiv:1812.04228·math.NA·December 12, 2018

TL;DR

This paper introduces a new method for approximating correlation matrices in asset portfolios using a Gramian-based transformation and conjugate gradient optimization, demonstrating effectiveness through numerical examples.

Contribution

It characterizes the feasible set via Gramian and trigonometric transforms and converts the problem into an unconstrained optimization, solved efficiently with conjugate gradient methods.

Findings

01. Method is feasible and effective for correlation matrix approximation.

02. Transforms the problem into an unconstrained optimization for easier solving.

03. Numerical examples validate the approach's practicality.

Abstract

In this paper, we consider the generalized low rank approximation of the correlation matrices problem which arises in the asset portfolio. We first characterize the feasible set by using the Gramian representation together with a special trigonometric function transform, and then transform the generalized low rank approximation of the correlation matrices problem into an unconstrained optimization problem. Finally, we use the conjugate gradient algorithm with the strong Wolfe line search to solve the unconstrained optimization problem. Numerical examples show that our new method is feasible and effective.

Equations

\frac{1}{2}\sum\limits_{d=1}^{m}\|A^{(d)}-\widehat{Y}\|_{F}^{2}=\min\limits_{Y\in S_{n}^{+},\ diag(Y)=e,\ rank(Y)\leq k}\frac{1}{2}\sum\limits_{d=1}^{m}\|A^{(d)}-Y\|_{F}^{2}.

R^{(d)} = D^{(d)} C^{(d)} D^{(d)}

\frac{1}{2}\sum\limits_{d=1}^{m}\|A^{(d)}-\widehat{Y}\|_{F}^{2}=\min\limits_{Y\in S_{n}^{+},\ diag(Y)=e}\frac{1}{2}\sum\limits_{d=1}^{m}\|A^{(d)}-Y\|_{F}^{2}.

Y\in S_{n}^{+},\ rank(Y)\leq k\Longleftrightarrow\lambda_{k+1}(Y)+\cdots+\lambda_{n}(Y)=0,

S=\{Y\in R^{n\times n}\mid Y\in S_{n}^{+},\ rank(Y)\leq k\}.

Y = X X^{T}, X \in R^{n \times k} .

\Gamma=\{Y\in R^{n\times n}\mid diag(Y)=e\}.

X=[X_{1},X_{2},\cdots,X_{k}]=\left[\begin{array}[]{cccc}x_{11}&x_{12}&\cdots&x_{1k}\\ x_{21}&x_{22}&\cdots&x_{2k}\\ \vdots&\vdots&\ddots&\vdots\\ x_{n1}&x_{n2}&\cdots&x_{nk}\end{array}\right]\in R^{n\times k}.

X_{1}=\left[\begin{array}[]{c}\cos\alpha_{11}\\ \cos\alpha_{21}\\ \vdots\\ \cos\alpha_{n1}\end{array}\right],\ \ X_{2}=\left[\begin{array}[]{c}\cos\alpha_{12}\sin\alpha_{11}\\ \cos\alpha_{22}\sin\alpha_{21}\\ \vdots\\ \cos\alpha_{n2}\sin\alpha_{n1}\end{array}\right],\cdots,

X_{k-1}=\left[\begin{array}[]{c}\cos\alpha_{1k-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{1l}\\ \cos\alpha_{2k-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{2l}\\ \vdots\\ \cos\alpha_{nk-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{nl}\end{array}\right],\ \ X_{k}=\left[\begin{array}[]{c}\prod\limits_{l=1}^{k-1}\sin\alpha_{1l}\\ \prod\limits_{l=1}^{k-1}\sin\alpha_{2l}\\ \vdots\\ \prod\limits_{l=1}^{k-1}\sin\alpha_{nl}\end{array}\right],

X=[X_{1},X_{2}]=\left[\begin{array}[]{cc}\cos\alpha_{11}&\sin\alpha_{11}\\ \cos\alpha_{21}&\sin\alpha_{21}\\ \vdots&\vdots\\ \cos\alpha_{n1}&\sin\alpha_{n1}\end{array}\right].

\chi_{i}=[\cos\alpha_{i1},\ \sin\alpha_{i1}].

y_{ii}=\chi_{i}\cdot\chi_{i}^{T}=(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1})^{2}=1.

\chi_{i}=[\cos\alpha_{i1},\ \sin\alpha_{i1}\cos\alpha_{i2},\ \cdots,\ \cos\alpha_{ik-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{il},\ \prod\limits_{l=1}^{k-1}\sin\alpha_{il}].

\begin{array}{lll}y_{ii}&=&\chi_{i}\cdot\chi_{i}^{T}\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\cos\alpha_{ik-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{il})^{2}+(\prod\limits_{l=1}^{k-1}\sin\alpha_{il})^{2}\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\prod\limits_{l=1}^{k-2}\sin\alpha_{il})^{2}(\cos^{2}\alpha_{ik-1}+\sin^{2}\alpha_{ik-1})\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\cos\alpha_{ik-2}\prod\limits_{l=1}^{k-3}\sin\alpha_{il})^{2}+(\prod\limits_{l=1}^{k-2}\sin\alpha_{il})^{2}\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\prod\limits_{l=1}^{k-3}\sin\alpha_{il})^{2}(\cos^{2}\alpha_{ik-2}+\sin^{2}\alpha_{ik-2})\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\cos\alpha_{ik-3}\prod\limits_{l=1}^{k-4}\sin\alpha_{il})^{2}+(\prod\limits_{l=1}^{k-3}\sin\alpha_{il})^{2}\\ &=&\cdots\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+(\sin\alpha_{i1}\sin\alpha_{i2})^{2}\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1})^{2}\\ &=&1.\end{array}

X=[X_{1},X_{2}]=\left[\begin{array}[]{cc}cos\alpha_{11}&sin\alpha_{11}\\ cos\alpha_{21}&sin\alpha_{21}\\ cos\alpha_{31}&sin\alpha_{31}\end{array}\right].

\begin{array}[]{lll}Y&=&XX^{T}\\ &=&\left[\begin{array}[]{cc}cos\alpha_{11}&sin\alpha_{11}\\ cos\alpha_{21}&sin\alpha_{21}\\ cos\alpha_{31}&sin\alpha_{31}\end{array}\right]\left[\begin{array}[]{cc}cos\alpha_{11}&sin\alpha_{11}\\ cos\alpha_{21}&sin\alpha_{21}\\ cos\alpha_{31}&sin\alpha_{31}\end{array}\right]^{T}\\ \\ &=&\left[\begin{array}[]{ccc}1&cos\alpha_{11}cos\alpha_{21}+sin\alpha_{11}sin\alpha_{21}&cos\alpha_{11}cos\alpha_{31}+sin\alpha_{11}sin\alpha_{31}\\ cos\alpha_{11}cos\alpha_{21}+sin\alpha_{11}sin\alpha_{21}&1&cos\alpha_{21}cos\alpha_{31}+sin\alpha_{21}sin\alpha_{31}\\ cos\alpha_{11}cos\alpha_{31}+sin\alpha_{11}sin\alpha_{31}&cos\alpha_{21}cos\alpha_{31}+sin\alpha_{21}sin\alpha_{31}&1\end{array}\right].\end{array}

Y=(y_{ij})_{n\times n}=\left\{\begin{array}[]{cc}\sum\limits_{p=1}^{k-1}cos\alpha_{ip}cos\alpha_{jp}\prod\limits_{l=1}^{p-1}\sin\alpha_{il}\sin\alpha_{jl}+\prod\limits_{l=1}^{k-1}\sin\alpha_{il}\sin\alpha_{jl},&i\neq j\\ 1,&i=j\end{array}\right..

\min\limits_{\alpha\in R^{n\times(k-1)}}F(\alpha),

F(\alpha)=\sum\limits_{d=1}^{m}\sum\limits_{i=1}^{n-1}\sum\limits_{j=i+1}^{n}(\sum\limits_{p=1}^{k-1}\cos\alpha_{ip}\cos\alpha_{jp}\prod\limits_{l=1}^{p-1}\sin\alpha_{il}\sin\alpha_{jl}+\prod\limits_{l=1}^{k-1}\sin\alpha_{il}\sin\alpha_{jl}-A_{ij}^{(d)})^{2}.

\nabla F (α) = (\frac{\partial F ( α )}{\partial α _{11}}, \frac{\partial F ( α )}{\partial α _{21}}, \dots, \frac{\partial F ( α )}{\partial α _{n 1}}, \dots, \frac{\partial F ( α )}{\partial α _{1 k - 1}}, \frac{\partial F ( α )}{\partial α _{2 k - 1}}, \dots, \frac{\partial F ( α )}{\partial α _{nk - 1}})^{T},

\begin{array}[]{lll}\frac{\partial F(\alpha)}{\partial\alpha_{\mu\nu}}&=&2\sum\limits_{d=1}^{m}\sum\limits_{i=1,i\neq\mu}^{n}\{(\sum\limits_{p=1}^{k-1}cos\alpha_{\mu p}cos\alpha_{ip}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}-A^{(d)}_{\mu i})\\ &\times&(-\sin\alpha_{\mu\nu}\cos\alpha_{i\nu}\prod\limits_{l=1}^{\nu-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\cos\alpha_{\mu\nu}sin\alpha_{i\nu}\prod\limits_{l=1,l\neq\nu}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}\\ &+&\cos\alpha_{\mu\nu}\sin\alpha_{i\nu}\sum\limits_{p=\nu+1}^{k-1}cos\alpha_{\mu p}cos\alpha_{ip}\prod\limits_{l=1,l\neq\nu}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il})\},\end{array}

\sum\limits_{i=1}^{\mu-1}(\sum\limits_{p=1}^{k-1}\cos\alpha_{\mu p}\cos\alpha_{ip}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}-A_{i\mu})^{2}+

\sum\limits_{j=\mu+1}^{n}(\sum\limits_{p=1}^{k-1}\cos\alpha_{\mu p}\cos\alpha_{jp}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{jl}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{jl}-A_{\mu j})^{2}.

\begin{array}[]{lll}\frac{\partial F(\alpha)}{\partial\alpha_{\mu\nu}}&=&\frac{\partial}{\partial\alpha_{\mu\nu}}\{\sum\limits_{i=1}^{\mu-1}(\sum\limits_{p=1}^{k-1}cos\alpha_{\mu p}cos\alpha_{ip}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}-A_{i\mu})^{2}\\ &+&\sum\limits_{j=\mu+1}^{n}(\sum\limits_{p=1}^{k-1}cos\alpha_{\mu p}cos\alpha_{jp}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{jl}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{jl}-A_{\mu j})^{2}\}\\ &=&2\sum\limits_{i=1}^{\mu-1}\{(\sum\limits_{p=1}^{k-1}cos\alpha_{\mu p}cos\alpha_{ip}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}-A_{i\mu})\\ &\times&(-\sin\alpha_{\mu\nu}\cos\alpha_{i\nu}\prod\limits_{l=1}^{\nu-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\prod\limits_{l=1,l\neq\nu}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}cos\alpha_{\mu\nu}sin\alpha_{i\nu}\\ &+&\sum\limits_{p=\nu+1}^{k-1}cos\alpha_{\mu p}cos\alpha_{ip}\prod\limits_{l=1,l\neq\nu}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il}\cos\alpha_{\mu\nu}\sin\alpha_{i\nu})\}\\ &+&2\sum\limits_{j=\mu+1}^{n}\{(\sum\limits_{p=1}^{k-1}cos\alpha_{\mu p}cos\alpha_{jp}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{jl}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{jl}-A_{\mu j})\\ &\times&(-\sin\alpha_{\mu\nu}\cos\alpha_{j\nu}\prod\limits_{l=1}^{\nu-1}\sin\alpha_{\mu l}\sin\alpha_{jl}+\prod\limits_{l=1,l\neq\nu}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{jl}cos\alpha_{\mu\nu}sin\alpha_{j\nu}\\ &+&\sum\limits_{p=\nu+1}^{k-1}cos\alpha_{\mu p}cos\alpha_{jp}\prod\limits_{l=1,l\neq\nu}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{jl}\cos\alpha_{\mu\nu}\sin\alpha_{j\nu})\}.\end{array}

\begin{array}[]{lll}\frac{\partial F(\alpha)}{\partial\alpha_{\mu\nu}}&=&2\sum\limits_{i=1,i\neq\mu}^{n}\{(\sum\limits_{p=1}^{k-1}cos\alpha_{\mu p}cos\alpha_{ip}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}-A_{\mu i})\\ &\times&(-\sin\alpha_{\mu\nu}\cos\alpha_{i\nu}\prod\limits_{l=1}^{\nu-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\cos\alpha_{\mu\nu}sin\alpha_{i\nu}\prod\limits_{l=1,l\neq\nu}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}\\ &+&\cos\alpha_{\mu\nu}\sin\alpha_{i\nu}\sum\limits_{p=\nu+1}^{k-1}cos\alpha_{\mu p}cos\alpha_{ip}\prod\limits_{l=1,l\neq\nu}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il})\},\end{array}

d_{t}=\left\{\begin{array}[]{cc}-g_{t},&t=0\\ -g_{t}+\frac{g^{T}_{t}g_{t}}{g^{T}_{t-1}g_{t-1}}d_{t-1},&t\geq 1\end{array}\right..

\left\{\begin{array}[]{c}F(\alpha_{t+1})\leq F(\alpha_{t})+\delta\rho^{m_{t}}g_{t}^{T}d_{t}\\ \mid g_{t+1}^{T}d_{t}\mid\leq-\sigma g_{t}^{T}d_{t}\end{array}\right..

\Omega(\alpha_{0})=\{\alpha\in R^{n\times(k-1)}\mid F(\alpha)\leq F(\alpha_{0})\}

\liminf\limits_{t\to\infty}\|\nabla F(\alpha_{t})\|_{F}=0.

Full text

On the generalized low rank approximation of the correlation matrices arising in the asset portfolio

The work was supported by National Natural Science Foundation of China (Nos. 11101100; 11261014; 11301107; 61362021), Natural Science Foundation of Guangxi Province (Nos. 2012GXNSFBA053006; 2013GXNSFBA019009; 2013GXNSFBB053005; 2013GXNSFDA019030), the Fund for Guangxi Experiment Center of Information Science (20130103), Innovation Project of GUET Graduate Education (GDYCSZ201473), Innovation Project of Guangxi Graduate Education (YCSZ2014137), and Guangxi Key Lab of Wireless Wideband Communication and Signal Processing open grant 2012.

Xuefeng Duan (corresponding author), Jianchao Bai, Maojun Zhang, Xinjun Zhang

E-mail addresses: [email protected] (X. Duan), [email protected] (J. Bai).

*College of Mathematics and Computational Science, Guilin University of Electronic Technology, Guilin 541004, P.R. China*

Abstract In this paper, we consider the generalized low rank approximation of the correlation matrices problem which arises in the asset portfolio. We first characterize the feasible set by using the Gramian representation together with a special trigonometric function transform, and then transform the generalized low rank approximation of the correlation matrices problem into an unconstrained optimization problem. Finally, we use the conjugate gradient algorithm with the strong Wolfe line search to solve the unconstrained optimization problem. Numerical examples show that our new method is feasible and effective.

Keywords: Generalized low rank approximation; Correlation matrix; Asset portfolio; Feasible set; Conjugate gradient algorithm

AMS subject classifications. 11D07; 68W25; 65F30

1. Introduction

Throughout this paper, we use $R^{n\times n}$ and $S^{+}_{n}$ to denote the set of $n\times n$ real matrices and the set of $n\times n$ symmetric positive semidefinite matrices, respectively. We use $A^{T}$ and $tr(A)$ to represent the transpose and trace of the matrix $A$, respectively. The symbols $\|A\|_{F}$ and $rank(A)$ denote the Frobenius norm and the rank of the matrix $A$, respectively. The symbol $diag(Y)$ stands for the vector of diagonal entries of the matrix $Y$, and the symbol $e$ stands for the all-ones vector, i.e., $e=(1,1,\cdots,1)^{T}.$

In this paper, we consider the following problem named generalized low rank approximation of the correlation matrices.

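Problem 1.1. Given correlation matrices $A^{(d)}\in R^{n\times n},\ d=1,2,\cdots,m$, and a positive integer $k,\ 1\leq k<n$, find a correlation matrix $\widehat{Y}$ whose rank is less than or equal to $k$ such that

$$\frac{1}{2}\sum\limits_{d=1}^{m}\|A^{(d)}-\widehat{Y}\|_{F}^{2}=\min\limits_{Y\in S_{n}^{+},\ diag(Y)=e,\ rank(Y)\leq k}\frac{1}{2}\sum\limits_{d=1}^{m}\|A^{(d)}-Y\|_{F}^{2}.\qquad(1.1)$$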

Problem (1.1) arises in the asset portfolio (see [10] for more details) and can be stated as follows. Suppose that $R=DCD$ is the covariance matrix of $n$ assets, where $C$ is a correlation matrix and $D$ is a diagonal matrix with positive diagonal entries (the standard deviations of the assets), which describe the risk of the assets. In practice, the covariance matrix is usually estimated from historical data on the return of each asset, that is, an approximate covariance matrix is obtained by statistical methods. Let

$$R^{(d)}=D^{(d)}C^{(d)}D^{(d)}$$

be the approximate covariance matrix obtained from the $d$th sample of the data, where $D^{(d)}$ and $C^{(d)}$ are the $d$th approximate diagonal matrix and correlation matrix, respectively. Higham [4] proposed a method for finding the nearest low rank approximation of a correlation matrix from a single sample (i.e., $m=1$). However, it is difficult for the decision maker to choose the best approximate covariance matrix from a single sample, because the data on asset prices are always noisy. Thus, we use repeated sampling to obtain a series of approximate covariance matrices, that is, $d$ ranges from $1$ to $m$. It is easy to obtain the optimal diagonal matrix $\widehat{D}$ from the series $D^{(d)}$. The major obstacle to finding the optimal covariance matrix is constructing the optimal correlation matrix $\widehat{C}$ from the series $C^{(d)}$. This consideration leads to the following problem: given correlation matrices $A^{(1)},A^{(2)},\cdots,A^{(m)}\in R^{n\times n}$, find a correlation matrix $\widehat{Y}$ such that

$$\frac{1}{2}\sum\limits_{d=1}^{m}\|A^{(d)}-\widehat{Y}\|_{F}^{2}=\min\limits_{Y\in S_{n}^{+},\ diag(Y)=e}\frac{1}{2}\sum\limits_{d=1}^{m}\|A^{(d)}-Y\|_{F}^{2}.\qquad(1.2)$$

Meanwhile, for large financial correlation matrices, almost all of the variance can usually be attributed to a few stochastic Brownian factors. Therefore, instead of taking all Brownian motions into account, we wish to simulate with a smaller number of factors, i.e., $rank(Y)<n$, and typically $rank(Y)$ ranges from 1 to $k$. Problem (1.2) with this rank constraint then becomes problem (1.1).

Note that the matrix $Y$ in problem (1.1) is not only positive semidefinite but also satisfies $rank(Y)\leq k$, so problem (1.1) is a structured low rank approximation problem. As Gillard and Zhigljavsky [3] noted, structured low rank approximation is a difficult optimization problem, so there is much work to be done. In the last few years, there has been a steadily increasing interest in the theory and numerical methods for the nearest low rank approximation of a correlation matrix, due to its wide applications in finance and risk management [6], machine learning [15], bank stress testing [13], industrial process monitoring [7] and image processing [5]. Recently, problem (1.1) with $m=1$ has been extensively studied, and the results mainly concentrate on two cases: one without the rank constraint and the other with the rank constraint.

For the case without the rank constraint, Higham [4] proposed an alternating projection algorithm to solve the nearest correlation matrix problem by defining two projection operators. Under some proper assumptions, Li-Li [8] developed a projected semismooth Newton method to solve the problem of calibrating least squares covariance matrix. Qi and Sun [12] proposed a Newton-type method for the nearest correlation matrix problem and proved its quadratic convergence. In [13], an unconstrained convex optimization approach was proposed to find the nearest correlation matrix to the target matrix with the fixed correlations unaltered. Besides, Qi-Sun [14] introduced an augmented Lagrangian dual method for the H-weighted nearest correlation matrix problem. This method solves a sequence of unconstrained strongly convex optimization problems, each of which can be solved by a semismooth Newton method combined with the conjugate gradient method. Recently, Yin et al. [18, 20] developed two new alternative gradient algorithms to compute the nearest correlation matrix.

For the case with the rank constraint, by making use of the fact that

$$Y\in S_{n}^{+},\ rank(Y)\leq k\Longleftrightarrow\lambda_{k+1}(Y)+\cdots+\lambda_{n}(Y)=0,$$

Gao and Sun [2] proposed a majorized penalty approach for solving the rank constrained correlation matrix problem. Their approach can deal with some large scale problems ($n\geq 500$). Motivated by the method in [12] and based on the well-known result that the sum of the largest eigenvalues of a symmetric matrix can be represented as a semidefinite programming problem, Li-Qi [9] proposed a novel sequential semismooth Newton method to solve problem (1.1) with $m=1$. They formulate the problem as a bi-affine semidefinite program and then use an augmented Lagrangian method to solve a sequence of least squares problems. Both Simon-Abell [16] and Pietersz-Groenen [11] used a majorization approach to solve the low rank approximation of a correlation matrix. The difference is that the former solved the problem with any weighted norm, while the latter handled only the Frobenius norm. By constructing a Lagrange function, Zhang-Wu [21] transformed the low rank approximation of a correlation matrix into a min-max problem, where the inner maximization problem was solved with a closed form spectral decomposition and the outer minimization problem was solved with gradient-based methods. In [1], Grubisic and Pietersz introduced a geometric programming approach to solve the low rank nearest correlation matrix problem. The method can be used to minimize any sufficiently smooth objective function.

However, as far as we know, results on problem (1.1) with $m>1$ are very few. The main difficulties in solving problem (1.1) are how to characterize the feasible set and how to deal with its complex structure. In this paper, we overcome these difficulties by using the Gramian representation together with a special trigonometric function transform. Problem (1.1) is then transformed into an unconstrained optimization problem. Finally, the conjugate gradient method with the strong Wolfe line search is used to solve the unconstrained optimization problem. Numerical examples show that our new method is feasible and effective.

2. Main results

In this section, we first transform problem (1.1) into an unconstrained optimization problem by making use of the Gramian representation together with a special trigonometric function transform. Then we use the conjugate gradient algorithm with the strong Wolfe line search to solve it.

We first define the following set

$$S=\{Y\in R^{n\times n}\mid Y\in S_{n}^{+},\ rank(Y)\leq k\}.$$

It is easy to characterize the set $S$ by using the Gramian representation (see [17]), i.e.,

$$Y=XX^{T},\quad X\in R^{n\times k}.$$

Set

$$\Gamma=\{Y\in R^{n\times n}\mid diag(Y)=e\}.$$

It is easy to verify that the feasible set of problem (1.1) is $S\cap\Gamma$. The main difficulty in solving problem (1.1) is how to characterize this feasible set. We now use the Gramian representation together with a special trigonometric function transform to characterize the feasible set $S\cap\Gamma$.

Theorem 2.1. Let the matrix $X$ be

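$$X=[X_{1},X_{2},\cdots,X_{k}]=\left[\begin{array}{cccc}x_{11}&x_{12}&\cdots&x_{1k}\\ x_{21}&x_{22}&\cdots&x_{2k}\\ \vdots&\vdots&\ddots&\vdots\\ x_{n1}&x_{n2}&\cdots&x_{nk}\end{array}\right]\in R^{n\times k}.$$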

Suppose

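$$X_{1}=\left[\begin{array}{c}\cos\alpha_{11}\\ \cos\alpha_{21}\\ \vdots\\ \cos\alpha_{n1}\end{array}\right],\ \ X_{2}=\left[\begin{array}{c}\cos\alpha_{12}\sin\alpha_{11}\\ \cos\alpha_{22}\sin\alpha_{21}\\ \vdots\\ \cos\alpha_{n2}\sin\alpha_{n1}\end{array}\right],\cdots,$$

$$X_{k-1}=\left[\begin{array}{c}\cos\alpha_{1k-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{1l}\\ \cos\alpha_{2k-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{2l}\\ \vdots\\ \cos\alpha_{nk-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{nl}\end{array}\right],\ \ X_{k}=\left[\begin{array}{c}\prod\limits_{l=1}^{k-1}\sin\alpha_{1l}\\ \prod\limits_{l=1}^{k-1}\sin\alpha_{2l}\\ \vdots\\ \prod\limits_{l=1}^{k-1}\sin\alpha_{nl}\end{array}\right],$$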

where $\alpha_{ij}\in R,\ i=1,2,\cdots,n,\ j=1,2,\cdots,k-1,$ then the matrix $Y=XX^{T}\in R^{n\times n}$ is not only symmetric positive semidefinite, but also satisfies $rank(Y)\leq k$ and $diag(Y)=e.$

Proof. By using the Gramian representation, it is easy to verify that the matrix $Y$ is symmetric positive semidefinite and satisfies $rank(Y)\leq k$ . Hence, we only need to prove $diag(Y)=e$ .
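The two facts taken for granted in this step can be written out in one line. Here $z$ denotes an arbitrary vector in $R^{n}$, a symbol introduced only for this check:

$$z^{T}Yz=z^{T}XX^{T}z=\|X^{T}z\|_{2}^{2}\geq 0,\qquad rank(Y)=rank(XX^{T})=rank(X)\leq\min\{n,k\}=k.$$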

Consider the matrix $X$ with $k=2$ . According to the assumptions, we have

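$$X=[X_{1},X_{2}]=\left[\begin{array}{cc}\cos\alpha_{11}&\sin\alpha_{11}\\ \cos\alpha_{21}&\sin\alpha_{21}\\ \vdots&\vdots\\ \cos\alpha_{n1}&\sin\alpha_{n1}\end{array}\right].$$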

Let $\chi_{i}\ (i=1,2,\cdots,n)$ be the $i$ th row of the matrix $X$ , that is,

$$\chi_{i}=[\cos\alpha_{i1},\ \sin\alpha_{i1}].$$

By multiplying $\chi_{i}$ and $\chi_{i}^{T}$ , we get the element $y_{ii}$ of the matrix $Y$ , that is,

$$y_{ii}=\chi_{i}\cdot\chi_{i}^{T}=(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1})^{2}=1.$$

That is to say, $diag(Y)=e$ . Hence, Theorem 2.1 holds when $k=2$ .

When $k>2$ , without loss of generality, we take the $i$ th row of the matrix $X$ and write it as $\chi_{i},\ i=1,2,\cdots,n$ , then

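$$\chi_{i}=[\cos\alpha_{i1},\ \sin\alpha_{i1}\cos\alpha_{i2},\ \cdots,\ \cos\alpha_{ik-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{il},\ \prod\limits_{l=1}^{k-1}\sin\alpha_{il}].$$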

By multiplying $\chi_{i}$ and $\chi_{i}^{T}$ , we get the element $y_{ii}$ of the matrix $Y$ , that is,

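$$\begin{array}{lll}y_{ii}&=&\chi_{i}\cdot\chi_{i}^{T}\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\cos\alpha_{ik-1}\prod\limits_{l=1}^{k-2}\sin\alpha_{il})^{2}+(\prod\limits_{l=1}^{k-1}\sin\alpha_{il})^{2}\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\prod\limits_{l=1}^{k-2}\sin\alpha_{il})^{2}(\cos^{2}\alpha_{ik-1}+\sin^{2}\alpha_{ik-1})\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\cos\alpha_{ik-2}\prod\limits_{l=1}^{k-3}\sin\alpha_{il})^{2}+(\prod\limits_{l=1}^{k-2}\sin\alpha_{il})^{2}\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\prod\limits_{l=1}^{k-3}\sin\alpha_{il})^{2}(\cos^{2}\alpha_{ik-2}+\sin^{2}\alpha_{ik-2})\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+\cdots+(\cos\alpha_{ik-3}\prod\limits_{l=1}^{k-4}\sin\alpha_{il})^{2}+(\prod\limits_{l=1}^{k-3}\sin\alpha_{il})^{2}\\ &=&\cdots\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1}\cos\alpha_{i2})^{2}+(\sin\alpha_{i1}\sin\alpha_{i2})^{2}\\ &=&(\cos\alpha_{i1})^{2}+(\sin\alpha_{i1})^{2}\\ &=&1.\end{array}$$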

Hence, for any $k\geq 2$ , we have $y_{ii}=1,\ i=1,2,\cdots,n$ , that is, $diag(Y)=e.$ $\ \ \ \ \Box$
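As a numerical illustration of Theorem 2.1, the following Python sketch builds the rows of $X$ from a table of angles and forms $Y=XX^{T}$. The helper names `row_from_angles` and `gram` and the sample angles are assumptions made for this example; they do not come from the paper.

```python
import math

def row_from_angles(angles):
    """Map k-1 angles to one row of X (a unit vector of length k)."""
    row, sin_prod = [], 1.0
    for a in angles:
        # p-th entry: cos(alpha_p) * prod_{l < p} sin(alpha_l)
        row.append(math.cos(a) * sin_prod)
        sin_prod *= math.sin(a)
    # last entry: prod_{l <= k-1} sin(alpha_l)
    row.append(sin_prod)
    return row

def gram(alpha):
    """Return Y = X X^T for an n x (k-1) table of angles."""
    X = [row_from_angles(r) for r in alpha]
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

# Arbitrary sample angles with n = 4 and k = 3 (illustrative values only).
alpha = [[0.3, 1.1], [2.0, -0.7], [0.9, 4.2], [-1.5, 0.4]]
Y = gram(alpha)

for i in range(4):
    print(round(Y[i][i], 12))  # each diagonal entry equals 1 up to rounding
```

Because every row of $X$ has unit length regardless of the angles, no constraint on $\alpha$ is needed, which is what makes the later reformulation unconstrained.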

Remark 2.1. As Simon and Abell [16] noted, a correlation matrix is a symmetric positive semidefinite matrix with unit diagonal, and any symmetric positive semidefinite matrix with unit diagonal is a correlation matrix. Hence the matrix $Y$ in Theorem 2.1 is always a correlation matrix. Moreover, since $\alpha_{ij},\ i=1,2,\cdots,n,\ j=1,2,\cdots,k-1$ are arbitrary real numbers, the matrix $Y=XX^{T}$ can represent every correlation matrix of rank at most $k$.

Remark 2.2. To explain Theorem 2.1, we take a $3\times 2$ matrix for example. Set

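$$X=[X_{1},X_{2}]=\left[\begin{array}{cc}\cos\alpha_{11}&\sin\alpha_{11}\\ \cos\alpha_{21}&\sin\alpha_{21}\\ \cos\alpha_{31}&\sin\alpha_{31}\end{array}\right].$$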

By a simple calculation, we can obtain that

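$$\begin{array}{lll}Y&=&XX^{T}\\ &=&\left[\begin{array}{cc}\cos\alpha_{11}&\sin\alpha_{11}\\ \cos\alpha_{21}&\sin\alpha_{21}\\ \cos\alpha_{31}&\sin\alpha_{31}\end{array}\right]\left[\begin{array}{cc}\cos\alpha_{11}&\sin\alpha_{11}\\ \cos\alpha_{21}&\sin\alpha_{21}\\ \cos\alpha_{31}&\sin\alpha_{31}\end{array}\right]^{T}\\ &=&\left[\begin{array}{ccc}1&\cos\alpha_{11}\cos\alpha_{21}+\sin\alpha_{11}\sin\alpha_{21}&\cos\alpha_{11}\cos\alpha_{31}+\sin\alpha_{11}\sin\alpha_{31}\\ \cos\alpha_{11}\cos\alpha_{21}+\sin\alpha_{11}\sin\alpha_{21}&1&\cos\alpha_{21}\cos\alpha_{31}+\sin\alpha_{21}\sin\alpha_{31}\\ \cos\alpha_{11}\cos\alpha_{31}+\sin\alpha_{11}\sin\alpha_{31}&\cos\alpha_{21}\cos\alpha_{31}+\sin\alpha_{21}\sin\alpha_{31}&1\end{array}\right].\end{array}$$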

Obviously, the matrix $Y$ is not only symmetric positive semidefinite, but also satisfies $rank(Y)\leq 2$ and $diag(Y)=e.$
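For $k=2$ the off-diagonal entries collapse to a single cosine by the angle-difference identity. The specific angles used afterwards are illustrative values chosen here, not taken from the paper.

$$y_{ij}=\cos\alpha_{i1}\cos\alpha_{j1}+\sin\alpha_{i1}\sin\alpha_{j1}=\cos(\alpha_{i1}-\alpha_{j1}),\quad i\neq j.$$

For instance, $\alpha_{11}=0,\ \alpha_{21}=\pi/3,\ \alpha_{31}=\pi/2$ gives $y_{12}=1/2,\ y_{13}=0$ and $y_{23}=\cos(\pi/6)=\sqrt{3}/2$.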

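Proceeding as in the proof of Theorem 2.1, we can obtain the other elements of the matrix $Y$, that is,

$$Y=(y_{ij})_{n\times n}=\left\{\begin{array}{cc}\sum\limits_{p=1}^{k-1}\cos\alpha_{ip}\cos\alpha_{jp}\prod\limits_{l=1}^{p-1}\sin\alpha_{il}\sin\alpha_{jl}+\prod\limits_{l=1}^{k-1}\sin\alpha_{il}\sin\alpha_{jl},&i\neq j\\ 1,&i=j\end{array}\right..$$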

Substituting $y_{ij}$ into problem (1.1), it is easy to obtain that problem (1.1) can be written as the following unconstrained optimization problem.

Problem 2.1. Given some correlation matrices $A^{(d)}=(A^{(d)}_{ij})_{n\times n},\ d=1,2,\cdots,m$ , and a positive integer $k,\ 1\leq k<n$ , find the solution $\widehat{\alpha}\in R^{n\times(k-1)}$ of the following optimization problem

$$\min\limits_{\alpha\in R^{n\times(k-1)}}F(\alpha),\qquad(2.1)$$

where

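$$F(\alpha)=\sum\limits_{d=1}^{m}\sum\limits_{i=1}^{n-1}\sum\limits_{j=i+1}^{n}(\sum\limits_{p=1}^{k-1}\cos\alpha_{ip}\cos\alpha_{jp}\prod\limits_{l=1}^{p-1}\sin\alpha_{il}\sin\alpha_{jl}+\prod\limits_{l=1}^{k-1}\sin\alpha_{il}\sin\alpha_{jl}-A_{ij}^{(d)})^{2}.$$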
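To make the objective of Problem 2.1 concrete, here is a minimal Python sketch that evaluates $F(\alpha)$ as the sum of squared off-diagonal residuals over all pairs $i<j$ and all given matrices. The function names `y_entry` and `objective` and the test data are assumptions for illustration only.

```python
import math

def y_entry(ai, aj):
    """Off-diagonal entry y_ij computed from two angle rows of length k-1."""
    total, sin_prod = 0.0, 1.0
    for p in range(len(ai)):
        # cos(a_ip) cos(a_jp) * prod_{l < p} sin(a_il) sin(a_jl)
        total += math.cos(ai[p]) * math.cos(aj[p]) * sin_prod
        sin_prod *= math.sin(ai[p]) * math.sin(aj[p])
    return total + sin_prod  # add prod_{l <= k-1} sin(a_il) sin(a_jl)

def objective(alpha, mats):
    """F(alpha): squared residuals over i < j, summed over all matrices."""
    n = len(alpha)
    return sum((y_entry(alpha[i], alpha[j]) - A[i][j]) ** 2
               for A in mats for i in range(n - 1) for j in range(i + 1, n))

# Illustrative data with n = 3 and k = 2 (one angle per row).
alpha = [[0.0], [math.pi / 3], [math.pi / 2]]
Y = [[1.0 if i == j else y_entry(alpha[i], alpha[j]) for j in range(3)]
     for i in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

f_exact = objective(alpha, [Y])            # 0.0: Y is reproduced exactly
f_mixed = objective(alpha, [Y, identity])  # 0.25 + 0 + 0.75 = 1.0 up to rounding
print(f_exact, f_mixed)
```

The factor $\frac{1}{2}$ of problem (1.1) does not appear because each off-diagonal pair is counted once here instead of twice, and the diagonal residuals vanish since every matrix involved has unit diagonal.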

Next, we use the conjugate gradient algorithm with the strong Wolfe line search to solve the unconstrained optimization problem. The main difficulty in solving problem (2.1) is how to compute the gradient of the objective function $F(\alpha)$, which we now derive.

Theorem 2.2. The gradient of the objective function $F(\alpha)$ of problem (2.1) is

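$$\nabla F(\alpha)=(\frac{\partial F(\alpha)}{\partial\alpha_{11}},\frac{\partial F(\alpha)}{\partial\alpha_{21}},\cdots,\frac{\partial F(\alpha)}{\partial\alpha_{n1}},\cdots,\frac{\partial F(\alpha)}{\partial\alpha_{1k-1}},\frac{\partial F(\alpha)}{\partial\alpha_{2k-1}},\cdots,\frac{\partial F(\alpha)}{\partial\alpha_{nk-1}})^{T},$$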

where

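$$\begin{array}{lll}\frac{\partial F(\alpha)}{\partial\alpha_{\mu\nu}}&=&2\sum\limits_{d=1}^{m}\sum\limits_{i=1,i\neq\mu}^{n}\{(\sum\limits_{p=1}^{k-1}\cos\alpha_{\mu p}\cos\alpha_{ip}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}-A^{(d)}_{\mu i})\\ &\times&(-\sin\alpha_{\mu\nu}\cos\alpha_{i\nu}\prod\limits_{l=1}^{\nu-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\cos\alpha_{\mu\nu}\sin\alpha_{i\nu}\prod\limits_{l=1,l\neq\nu}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}\\ &+&\cos\alpha_{\mu\nu}\sin\alpha_{i\nu}\sum\limits_{p=\nu+1}^{k-1}\cos\alpha_{\mu p}\cos\alpha_{ip}\prod\limits_{l=1,l\neq\nu}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il})\},\end{array}\qquad(2.3)$$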

here $\mu=1,2,\cdots,n,\ \nu=1,2,\cdots,k-1$ .

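Proof. To prove Theorem 2.2, we only need to prove that (2.3) holds when $m=1$, because the gradient of the objective function $F(\alpha)$ with $m>1$ is the sum over $d$ of expressions of the same form.

For $m=1$, note that the terms of $F(\alpha)$ that involve $\alpha_{\mu\nu}$ are

$$\sum\limits_{i=1}^{\mu-1}(\sum\limits_{p=1}^{k-1}\cos\alpha_{\mu p}\cos\alpha_{ip}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{il}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{il}-A_{i\mu})^{2}+\sum\limits_{j=\mu+1}^{n}(\sum\limits_{p=1}^{k-1}\cos\alpha_{\mu p}\cos\alpha_{jp}\prod\limits_{l=1}^{p-1}\sin\alpha_{\mu l}\sin\alpha_{jl}+\prod\limits_{l=1}^{k-1}\sin\alpha_{\mu l}\sin\alpha_{jl}-A_{\mu j})^{2}.$$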

Hence, the derivative of $F(\alpha)$ with respect to $\alpha_{\mu\nu}$ is

[TABLE]

Because $A_{i\mu}=A_{\mu i}$, we rename the index $j$ as $i$ and conclude that

[TABLE]

where $\mu=1,2,\cdots,n,\ \nu=1,2,\cdots,k-1$ . $\ \ \ \ \Box$
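A gradient formula of this kind is easy to get wrong, so a finite-difference check is a useful safeguard. The sketch below treats only the special case $k=2$, assuming the transform then gives $y_{ij}=\cos(\alpha_{i}-\alpha_{j})$; in that case the symmetry argument used in the proof yields $\partial F/\partial\alpha_{\mu}=2\sum_{d}\sum_{j}(A^{(d)}_{\mu j}-y_{\mu j})\sin(\alpha_{\mu}-\alpha_{j})$. This is an illustration of the technique, not the general formula (2.3), and the test matrix and angles are hypothetical.

```python
import math

def objective(a, mats):
    """F(a) = 0.5 * sum_d sum_{i,j} (A_ij - cos(a_i - a_j))^2 for k = 2."""
    n = len(a)
    return 0.5 * sum((A[i][j] - math.cos(a[i] - a[j])) ** 2
                     for A in mats for i in range(n) for j in range(n))

def gradient(a, mats):
    """Analytic gradient; the factor 2 comes from the symmetry A_ij = A_ji."""
    n = len(a)
    return [2.0 * sum((A[mu][j] - math.cos(a[mu] - a[j])) * math.sin(a[mu] - a[j])
                      for A in mats for j in range(n))
            for mu in range(n)]

def finite_difference(a, mats, h=1e-6):
    """Central-difference approximation of the gradient."""
    g = []
    for mu in range(len(a)):
        up, dn = list(a), list(a)
        up[mu] += h
        dn[mu] -= h
        g.append((objective(up, mats) - objective(dn, mats)) / (2.0 * h))
    return g

# Hypothetical symmetric test matrix and angles.
A = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.4], [0.2, 0.4, 1.0]]
a = [0.3, 1.1, 2.0]
g_exact = gradient(a, [A])
g_approx = finite_difference(a, [A])
max_gap = max(abs(p - q) for p, q in zip(g_exact, g_approx))
print(max_gap)
```

A small `max_gap` indicates that the analytic expression and the objective are consistent with each other.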

Consequently, the conjugate gradient algorithm with the strong Wolfe line search for solving the minimization problem (2.1) can be described as in Algorithm 2.1.

**Algorithm 2.1** (This algorithm attempts to solve problem (2.1))

Step 1. Given parameters $\rho\in(0,1),\ \delta\in(0,0.5),\ \sigma\in(\delta,0.5)$, and tolerance error $0\leq tol\ll 1$. Choose an initial iterative matrix $\alpha_{0}\in R^{n\times(k-1)}$. Set $t:=0$.

Step 2. Calculate $g_{t}=\nabla F(\alpha_{t})$. If $\parallel g_{t}\parallel_{F}<tol$, stop and output $\alpha^{*}\approx\alpha_{t}$.

Step 3. Determine the search direction $d_{t}$, where

[TABLE]

Step 4. Determine the step length $\beta_{t}$ by applying the strong Wolfe line search, i.e.,

[TABLE]

Set $\beta_{t}=\rho^{m_{t}},\ \gamma_{t}=\alpha_{t}(:),\ \gamma_{t+1}=\gamma_{t}+\beta_{t}d_{t},\ \alpha_{t+1}=reshape(\gamma_{t+1},n,k-1).$

Step 5. Set $t:=t+1$ . Go to step 2.
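The steps above can be sketched in a few lines for the special case $k=2$. This is an illustration under stated assumptions, not the authors' implementation: it assumes $y_{ij}=\cos(\alpha_{i}-\alpha_{j})$, a Fletcher-Reeves coefficient in Step 3, and the standard strong Wolfe conditions tested on the trial steps $\rho^{m}$. The fallback to a sufficient-decrease step, the restart on a non-descent direction, and all names and parameter defaults are additions made only to keep the sketch robust.

```python
import math

def objective(a, mats):
    n = len(a)
    return 0.5 * sum((A[i][j] - math.cos(a[i] - a[j])) ** 2
                     for A in mats for i in range(n) for j in range(n))

def gradient(a, mats):
    n = len(a)
    return [2.0 * sum((A[mu][j] - math.cos(a[mu] - a[j])) * math.sin(a[mu] - a[j])
                      for A in mats for j in range(n))
            for mu in range(n)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def line_search(a, d, g, mats, rho=0.5, delta=0.1, sigma=0.4, max_m=40):
    """Try beta = rho**m; return the first step meeting both strong Wolfe
    conditions, else the first sufficient-decrease step, else None."""
    f0, slope = objective(a, mats), dot(g, d)
    fallback = None
    for m in range(max_m):
        beta = rho ** m
        trial = [ai + beta * di for ai, di in zip(a, d)]
        if objective(trial, mats) <= f0 + delta * beta * slope:
            if fallback is None:
                fallback = beta
            if abs(dot(gradient(trial, mats), d)) <= -sigma * slope:
                return beta
    return fallback

def conjugate_gradient(a0, mats, tol=1e-8, max_iter=500):
    a = list(a0)
    g = gradient(a, mats)
    d = [-gi for gi in g]
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:          # Step 2: stopping test
            break
        if dot(g, d) >= 0.0:                    # safeguard: restart with -g
            d = [-gi for gi in g]
        beta = line_search(a, d, g, mats)       # Step 4
        if beta is None:
            break
        a = [ai + beta * di for ai, di in zip(a, d)]
        g_new = gradient(a, mats)
        fr = dot(g_new, g_new) / dot(g, g)      # Fletcher-Reeves (assumed)
        d = [-gn + fr * di for gn, di in zip(g_new, d)]   # Step 3
        g = g_new
    return a

# Hypothetical rank-2 target built from known angles, and a hypothetical start.
theta = [0.2, 1.0, 2.5, 0.7]
A = [[math.cos(p - q) for q in theta] for p in theta]
start = [0.0, 0.5, 1.0, 1.5]
a_hat = conjugate_gradient(start, [A])
f_init = objective(start, [A])
f_final = objective(a_hat, [A])
print(f_init, f_final)
```

Because every accepted step satisfies the sufficient-decrease condition, the objective value never increases from one iterate to the next.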

Remark 2.3. To implement Algorithm 2.1, we first create three Matlab files, the fun file, the gfun file and the frac file, where the fun file computes $F(\alpha_{t})$, the gfun file computes $\nabla F(\alpha_{t})$, and the frac file minimizes $F(\alpha)$. In addition, $\alpha_{t}(:)$ returns the vector $\gamma_{t}$ of length $n(k-1)$ whose elements are taken column-wise from the matrix $\alpha_{t}$, and $reshape(\gamma_{t+1},n,k-1)$ returns the $n$ by $k-1$ matrix $\alpha_{t+1}$ whose elements are taken column-wise from $\gamma_{t+1}$.
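The column-wise stacking and reshaping described in Remark 2.3 can be mimicked outside Matlab as follows. This is a small sketch with hypothetical helper names `vec` and `reshape`, operating on matrices stored as lists of rows.

```python
def vec(matrix):
    """Stack the columns of a matrix into one flat list, like alpha(:)."""
    rows, cols = len(matrix), len(matrix[0])
    return [matrix[i][j] for j in range(cols) for i in range(rows)]

def reshape(vector, rows, cols):
    """Rebuild a rows x cols matrix column by column, like reshape(gamma, n, k-1)."""
    return [[vector[j * rows + i] for j in range(cols)] for i in range(rows)]

alpha = [[1, 4], [2, 5], [3, 6]]     # n = 3, k - 1 = 2
gamma = vec(alpha)                   # [1, 2, 3, 4, 5, 6]
restored = reshape(gamma, 3, 2)
print(gamma, restored == alpha)
```

The two operations are inverses of each other, which is what allows the update $\gamma_{t+1}=\gamma_{t}+\beta_{t}d_{t}$ to be carried out on vectors while $F$ and $\nabla F$ are evaluated on matrices.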

By Theorem 4.3.5 [19, P.203], we can establish the global convergence theorem for Algorithm 2.1.

Theorem 2.3. Suppose the function $F(\alpha)$ is twice continuously differentiable, the level set

[TABLE]

is bounded, and the step length $\beta_{t}$ is generated by (2.4), where $\delta<\sigma<0.5$. Then the sequence $\{\alpha_{t}\}$ generated by Algorithm 2.1 converges globally, that is,

[TABLE]

**3. Numerical Experiments**

In this section, we use two numerical examples to illustrate that Algorithm 2.1 can solve problem (2.1). All experiments were run in Matlab R2010a. We define the relative residual error

[TABLE]

and the gradient norm

[TABLE]

where $\alpha_{t}$ is the $t$-th iterate of Algorithm 2.1. We use the stopping criterion

[TABLE]

In the following examples, the initial value is a random $n\times(k-1)$ matrix generated by the Matlab function $rand$.

Example 3.1. Consider problem (2.1) with $m=1$ and

[TABLE]

Case I: Set k=3. We use Algorithm 2.1 with the initial value

[TABLE]

to solve problem (2.1). After 15 iterations, we get the solution $\widehat{\alpha}$ of problem (2.1)

[TABLE]

Hence, the solution $\widehat{Y}$ of problem (1.1) is

[TABLE]

The curves of the relative residual error $\epsilon(t)$ and the gradient norm $\|\nabla F(\alpha_{t})\|_{F}$ are shown in Fig. 1.

Case II: Set k=2. We use Algorithm 2.1 with the initial value

[TABLE]

to solve problem (2.1). After 13 iterations, we get the solution $\widehat{\alpha}$ of problem (2.1)

[TABLE]

Hence, the solution $\widehat{Y}$ of problem (1.1) is

[TABLE]

The curves of the relative residual error $\epsilon(t)$ and the gradient norm $\|\nabla F(\alpha_{t})\|_{F}$ are shown in Fig. 2.

To compare our algorithm with the Major algorithm in [11], we use both to solve problem (2.1) with the same initial value. Table 1 lists the number of iterations (denoted by "IT"), the CPU time (denoted by "CPU"), the gradient norm (denoted by "GN") and the relative residual error (denoted by "ERR").

[TABLE]

Example 3.1 shows that Algorithm 2.1 can solve problem (1.1). In particular, Table 1 shows that our algorithm outperforms the Major algorithm [11] in both iteration count and CPU time, which indicates that our algorithm converges faster than the Major algorithm on this example.

Next, we use an example to show that our algorithm can solve the generalized low rank approximation of correlation matrices arising in the asset portfolio.

Example 3.2. Computing an accurate correlation matrix of the assets is an important issue in portfolio selection. For instance, suppose that an investor uses one unit of money to buy a total of $11$ assets at the beginning of one period. Any two assets of the portfolio are related because the price of each asset depends on some common factors in the financial market, and the correlation matrix is one way to measure this relation. However, computing the correlation matrix accurately is the key problem for the investor, since the optimal investment policy is affected by the uncertainty of the parameters in the correlation matrix. To obtain the correlation matrix, the daily price data of each asset in the portfolio are taken from the Wind database, a Chinese financial database. Five sets of daily data are obtained by sampling five different periods of the data. Using Matlab, the five correlation matrices of the eleven assets are computed as follows.

[TABLE]

Set k=3, and we use Algorithm 2.1 with the initial value

[TABLE]

to solve problem (2.1). After 57 iterations, we get the solution $\widehat{\alpha}$ of problem (2.1)

[TABLE]

Hence, the solution $\widehat{Y}$ of problem (1.1) is

[TABLE]

The curves of the relative residual error $\epsilon(t)$ and the gradient norm $\|\nabla F(\alpha_{t})\|_{F}$ are shown in Fig. 3.

For the above example, we use Algorithm 2.1 to solve problem (2.1) with different ranks. Table 2 lists the number of iterations (denoted by "IT"), the CPU time (denoted by "CPU"), the gradient norm (denoted by "GN") and the relative residual error (denoted by "ERR").

[TABLE]

Fig. 3 and Table 2 show that Algorithm 2.1 can be used to solve the generalized low rank approximation of correlation matrices arising in the asset portfolio. More importantly, because the correlation matrix of assets is an important factor in selecting assets for a portfolio, using the matrix $\widehat{Y}$ obtained by Algorithm 2.1 to analyze the relationship between any two assets can reduce some of the noise in the data.

**4. Conclusion**

The generalized low rank approximation of correlation matrices is widely used in asset portfolio and risk management. It is a difficult matrix optimization problem, and the difficulties lie in handling its feasible set and complex structure. In this paper, we use the Gramian representation together with a special trigonometric function transform to overcome these difficulties, and develop a new algorithm to solve the problem. Numerical examples show that our new method is feasible and effective. Moreover, the theory and algorithm of this paper can be extended to solve the low rank approximation in Li-Qi [9], that is, the nearest low rank approximation of a correlation matrix to a given symmetric matrix.

Acknowledgements

The authors wish to thank Prof. Richard A Brualdi and the anonymous referee for providing very useful suggestions for improving this paper. The authors also thank Prof. Qingwen Wang for discussing the properties of the objective function.

Bibliography

[1]
[2] I. Grubisic, R. Pietersz, Efficient rank reduction of correlation matrices, Linear Algebra Appl. 422 (2007) 629-653.
[3] Y. Gao, D. Sun, A majorized penalty approach for calibrating rank constrained correlation matrix problems, Technical Report, Department of Mathematics, National University of Singapore, March 2010.
[4] J. Gillard, A. Zhigljavsky, Analysis of structured low rank approximation as an optimization problem, Inform. 22 (2011) 489-505.
[5] N. Higham, Computing the nearest correlation matrix - A problem from finance, IMA J. Numer. Anal. 22 (2002) 329-343.
[6] W. Hoge, A subspace identification extension to the phase correlation method, IEEE Trans. Med. Imaging 22 (2003) 277-280.
[7] P.H. Kupiec, Stress testing in a value at risk framework, J. Derivatives 6 (1998) 7-24.
[8] T. Kourti, Process analysis and abnormal situation detection: from theory to practice, IEEE Control Syst. Mag. 22 (2002) 10-25.