Generalized frame operator distance problems
Pedro G. Massey, Noelia B. Rios and Demetrio Stojanoff
CMaLP-FCE-UNLP and IAM-CONICET, Argentina
Partially supported by CONICET (PIP 0150/14), FONCyT (PICT 1506/15) and FCE-UNLP (11X681), Argentina.
Abstract
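Let $S\in\mathcal{M}_d(\mathbb{C})^+$ be a positive semidefinite $d\times d$ complex matrix and let $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in\mathbb{R}_{>0}^k$, indexed by $\mathbb{I}_k=\{1,\ldots,k\}$, be a $k$-tuple of positive numbers. Let $\mathbb{T}_d(\mathbf{a})$ denote the set of families $\mathcal{G}=\{g_i\}_{i\in\mathbb{I}_k}\in(\mathbb{C}^d)^k$ such that $\|g_i\|^2=a_i$, for $i\in\mathbb{I}_k$; thus, $\mathbb{T}_d(\mathbf{a})$ is the product of spheres in $\mathbb{C}^d$ endowed with the product metric. For a strictly convex unitarily invariant norm $N$ in $\mathcal{M}_d(\mathbb{C})$, we consider the generalized frame operator distance function $\Theta_{(N,S,\mathbf{a})}$ defined on $\mathbb{T}_d(\mathbf{a})$, given by
$$\Theta_{(N,S,\mathbf{a})}(\mathcal{G})=N(S-S_{\mathcal{G}}),\quad\text{where}\quad S_{\mathcal{G}}=\sum_{i\in\mathbb{I}_k}g_i\otimes g_i\in\mathcal{M}_d(\mathbb{C})^+.$$
In this paper we determine the geometrical and spectral structure of local minimizers $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ of $\Theta_{(N,S,\mathbf{a})}$. In particular, we show that local minimizers are global minimizers, and that these families do not depend on the particular choice of $N$.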
AMS subject classification: 42C15, 15A60.
Keywords: matrix approximation, unitarily invariant norms, majorization,
frame operator distance.
Contents
1 Introduction
2 Preliminaries
2.1 Matrix Analysis
2.2 Finite frames
3 Generalized frame operator distance functions
3.1 Statement of the problem and related results
3.2 Inner structure of local minimizers of GFOD's
3.3 The co-feasible case for $k\geq d$
4 Main results
4.1 When $k\geq d$
4.2 The general case
5 Proof of some technical results
1 Introduction
Matrix approximation problems are ubiquitous in applications of matrix analysis. Following [14], these problems can be briefly described as follows: given $S\in\mathcal{M}_d(\mathbb{C})$, a complex matrix of size $d$, a matrix norm $N$ in $\mathcal{M}_d(\mathbb{C})$, and a set $\mathcal{X}\subset\mathcal{M}_d(\mathbb{C})$, we search for the minimal distance
[TABLE]
and for the best approximations of $S$ from $\mathcal{X}$ (or nearest members in $\mathcal{X}$)
[TABLE]
Solving these problems, which are also known as matrix nearness or Procrustes problems in the literature (see for example the recent text [13], and the classic books of Bhatia [4] and Kato [15]), amounts to providing a characterization and, if possible, an explicit computation (in some cases sharp estimates) of $d_N(S,\mathcal{X})$ and of the set of best approximations $\mathcal{A}^{\rm op}_N(S,\mathcal{X})$.
A typical choice for $N$ is the Frobenius norm (also called the 2-norm), since it is a Euclidean norm (i.e. it is the norm associated with an inner product in $\mathcal{M}_d(\mathbb{C})$). Still, some other norms are also of interest, such as weighted norms, the $p$-norms for $1\leq p$ (which include the Frobenius norm), or the more general class of unitarily invariant norms. Some of the most important choices for $\mathcal{X}$ are the sets of selfadjoint matrices, positive semidefinite matrices, correlation matrices, orthogonal projections, oblique projections, and matrices with rank bounded by a fixed number (see [11, 12, 14, 16, 25]).
Once the nearness problem above has been solved for some $S$, some set $\mathcal{X}$ and norm $N$ in $\mathcal{M}_d(\mathbb{C})$, a natural proximity problem arises: for a fixed $A_0\in\mathcal{X}$, we search for (some sharp upper bound of) the distance
[TABLE]
where $d_{\mathcal{X}}$ denotes a metric in $\mathcal{X}$. In case $\mathcal{X}$ can be endowed with a smooth structure that is compatible with $d_{\mathcal{X}}$ and such that $\Psi(A)=N(A_0-A)$ is also a smooth function on $\mathcal{X}$, then estimates of $d_{\mathcal{X}}(A_0,\mathcal{A}^{\rm op}_N(S,\mathcal{X}))$ can be obtained by applying gradient descent algorithms for $\Psi$ or by studying the evolution of the solutions of flows in $\mathcal{X}$ associated with the gradient of $\Psi$.
Motivated by some optimization problems in finite frame theory, in [18] we considered the following matrix nearness problem. Fix an arbitrary positive semidefinite $S\in\mathcal{M}_d(\mathbb{C})^+$ and a finite sequence of positive numbers $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0})^k$, indexed by $\mathbb{I}_k=\{1,\ldots,k\}$; we considered the sets
[TABLE]
With this notation we solved the matrix nearness problem corresponding to $\mathcal{X}_{\mathbf{a}}\subset\mathcal{M}_d(\mathbb{C})^+$, for an arbitrary strictly convex unitarily invariant norm $N$ in $\mathcal{M}_d(\mathbb{C})$. That is, we obtained an explicit description of $d_N(S,\mathcal{X}_{\mathbf{a}})=d_N(S,\mathbf{a})$ and $\mathcal{A}^{\rm op}_N(S,\mathcal{X}_{\mathbf{a}})=\mathcal{A}^{\rm op}_N(S,\mathbf{a})$. We point out that the set $\mathcal{X}_{\mathbf{a}}$ above can also be described as the set of frame operators $S_{\mathcal{G}}$ of finite sequences $\mathcal{G}=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ (see Section 2 for details).
It is then natural to consider the proximity problem associated to the matrix nearness problem that we just described. Indeed, because of our initial motivation on this problem, we further pose the following (stronger) version: for $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ search for a (sharp) upper bound of
[TABLE]
where
[TABLE]
for $\mathcal{G}_0=\{g_i^0\}_{i\in\mathbb{I}_k},\ \mathcal{G}=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$. That is, we shift our attention from frame operators $S_{\mathcal{G}}$ to finite sequences $\mathcal{G}\in\mathbb{T}_d(\mathbf{a})$. Notice that in the particular case $S=\frac{k}{d}\,I$, $a_i=1$ for $i\in\mathbb{I}_k$ and $N$ is the Frobenius norm, this problem is related with Paulsen's proximity problem [5, 6, 7], which is a central open problem in finite frame theory.
In case the norm $N$ is sufficiently smooth, we could apply gradient descent algorithms to the function $\Theta=\Theta_{(N,S,\mathbf{a})}$ defined on $\mathbb{T}_d(\mathbf{a})$ (which is a smooth manifold in $(\mathbb{C}^d)^k$) given by $\Theta(\mathcal{G})=N(S-S_{\mathcal{G}})$, starting at $\mathcal{G}_0$. Such an approach was considered by N. Strawn [23, 24] for the Frobenius norm $N$. Also, we could study the evolution of solutions of gradient flows as considered in [17]. In the general case, the analysis of the behavior of gradient descent algorithms leads to the study of the local behavior of the map
[TABLE]
One important issue is determining whether local minimizers of $\Theta$ (that are natural attractors of gradient descent algorithms) are actually global minimizers. In [18] we settled this question in the affirmative for the Frobenius norm (thus solving a conjecture in [23]), by relating frame operator distance problems in the Frobenius norm with frame completion problems for the Benedetto-Fickus frame potential introduced in [3]. Unfortunately, the techniques used in [18] do not apply for arbitrary $N$ (not even for $p$-norms with $p>1$, $p\neq 2$). In the present work we tackle this problem and show that, in case $N$ is an arbitrary strictly convex u.i.n., local minimizers of $\Theta$ are characterized by a spectral condition that does not depend on $N$, but only on $S$ and $\mathbf{a}$. In particular, we conclude that local minimizers are global minimizers and do not depend on the particular choice of $N$. Our techniques rely on majorization theory and Lidskii's local theorems for unitarily invariant norms obtained in [19]; indeed, in that paper we showed that in some particular cases, local minimizers of the generalized frame operator distance (GFOD) functions (i.e. $\Theta(\mathcal{G})=N(S-S_{\mathcal{G}})$) are global minimizers. Based on the features of these particular cases, we introduce the notion of co-feasible GFOD problems. Although in general GFOD problems are not co-feasible, this notion plays a crucial role in the study of the spectral structure of local minimizers.
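Using that the map $\mathbb{T}_d(\mathbf{a})\ni\mathcal{G}\mapsto S_{\mathcal{G}}\in\mathcal{X}_{\mathbf{a}}$ is continuous, as a byproduct we obtain that local minimizers $S_{\mathcal{G}_0}\in\mathcal{X}_{\mathbf{a}}$ of the function
[TABLE]
are global minimizers and do not depend on the choice of strictly convex u.i.n. $N$. This last fact is weaker than the result for the functions $\Theta$, since the continuous map $\mathbb{T}_d(\mathbf{a})\ni\mathcal{G}\mapsto S_{\mathcal{G}}\in\mathcal{X}_{\mathbf{a}}$ does not have local cross sections around an arbitrary $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$.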
The paper is organized as follows. In Section 2 we include some preliminary material on matrix analysis and finite frame theory that is used throughout the paper. In Section 3 we state our main problem, namely the study of the geometrical and spectral structure of local minimizers of the GFOD functions (i.e. $\Theta$ as above) associated to a strictly convex unitarily invariant norm. We begin by obtaining a series of results related with what we call the inner structure of such local minimizers. In Section 4 we state our main results, namely that local minimizers of GFOD functions are global minimizers, and give an algorithmic construction of the eigenvalues of such families. Finally, in Section 5 we give detailed proofs of some results stated in Section 3.
2 Preliminaries
In this section we introduce the notation, terminology and results from matrix analysis (see the text [4])
and finite frame theory (see the texts [8, 9, 10]) that we will use throughout the
paper.
2.1 Matrix Analysis
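Notation and terminology. We let $\mathcal{M}_{k,d}(\mathbb{C})$ be the space of complex $k\times d$ matrices and write $\mathcal{M}_{d,d}(\mathbb{C})=\mathcal{M}_d(\mathbb{C})$ for the algebra of $d\times d$ complex matrices. We denote by $\mathcal{H}(d)\subset\mathcal{M}_d(\mathbb{C})$ the real subspace of selfadjoint matrices and by $\mathcal{M}_d(\mathbb{C})^+\subset\mathcal{H}(d)$ the cone of positive semidefinite matrices. We let $\mathcal{U}(d)\subset\mathcal{M}_d(\mathbb{C})$ denote the group of unitary matrices.
For $d\in\mathbb{N}$, let $\mathbb{I}_d=\{1,\ldots,d\}$. Given a vector $x\in\mathbb{C}^d$ we denote by $D_x$ the diagonal matrix in $\mathcal{M}_d(\mathbb{C})$ whose main diagonal is $x$. Given $x=(x_i)_{i\in\mathbb{I}_d}\in\mathbb{R}^d$ we denote by $x^\downarrow=(x_i^\downarrow)_{i\in\mathbb{I}_d}$ the vector obtained by rearranging the entries of $x$ in non-increasing order. We also use the notation $(\mathbb{R}^d)^\downarrow=\{x\in\mathbb{R}^d:\ x=x^\downarrow\}$ and $(\mathbb{R}_{\geq 0}^d)^\downarrow=\{x\in\mathbb{R}_{\geq 0}^d:\ x=x^\downarrow\}$. For $r\in\mathbb{N}$, we let $\mathbb{1}_r=(1,\ldots,1)\in\mathbb{R}^r$.
Given a matrix $A\in\mathcal{H}(d)$ we denote by $\lambda(A)=\lambda^\downarrow(A)=(\lambda_i(A))_{i\in\mathbb{I}_d}\in(\mathbb{R}^d)^\downarrow$ the eigenvalues of $A$ counting multiplicities and arranged in non-increasing order. For $B\in\mathcal{M}_d(\mathbb{C})$ we let $s(B)=\lambda(|B|)$ denote the singular values of $B$, i.e. the eigenvalues of $|B|=(B^*B)^{1/2}\in\mathcal{M}_d(\mathbb{C})^+$; we also let $\sigma(B)\subset\mathbb{C}$ denote the spectrum of $B$. If $x,y\in\mathbb{C}^d$ we denote by $x\otimes y=x\,y^*\in\mathcal{M}_d(\mathbb{C})$ the rank-one matrix given by $(x\otimes y)\,z=\langle z,y\rangle\,x$, for $z\in\mathbb{C}^d$.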
Next we recall the notion of majorization between vectors, which will play a central role throughout our work.
Definition 2.1.
Let $x\in\mathbb{R}^k$ and $y\in\mathbb{R}^d$. We say that $x$ is submajorized by $y$, and write $x\prec_w y$, if
[TABLE]
If $x\prec_w y$ and $\operatorname{tr}x=\sum_{i=1}^k x_i=\sum_{i=1}^d y_i=\operatorname{tr}y$, then $x$ is majorized by $y$, and we write $x\prec y$.
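Remark 2.2.
Given $x,y\in\mathbb{R}^d$ we write $x\leqslant y$ if $x_i\leq y_i$ for every $i\in\mathbb{I}_d$. It is a standard exercise to show that:
1. $x\leqslant y\implies x^\downarrow\leqslant y^\downarrow\implies x\prec_w y$.
2. $x\prec y\implies |x|\prec_w|y|$, where $|x|=(|x_i|)_{i\in\mathbb{I}_d}\in\mathbb{R}_{\geq 0}^d$.
3. $x\prec y$ and $|x|^\downarrow=|y|^\downarrow\implies x^\downarrow=y^\downarrow$.
4. $x\prec y$ and $z\prec w\in\mathbb{R}^e\implies(x,z)\prec(y,w)\in\mathbb{R}^{d+e}$.
△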
Although majorization is not a total order in $\mathbb{R}^d$, there are several fundamental inequalities in matrix theory that can be described in terms of this relation. As an example of this phenomenon we can consider Lidskii's (additive) inequality (see [4]). In the following result we also include the characterization of the case of equality obtained in [22].
Theorem 2.3 (Lidskii's inequality).
Let $A,B\in\mathcal{H}(d)$. Then
1. $\lambda(A)-\lambda(B)\prec\lambda(A-B)$.
2. $\lambda(A-B)=\big(\lambda(A)-\lambda(B)\big)^\downarrow$ if and only if there exists $\{v_i\}_{i\in\mathbb{I}_d}$ an ONB of $\mathbb{C}^d$ such that
[TABLE]
Notice that in this case, $A$ and $B$ commute. □
Recall that a norm $N$ in $\mathcal{M}_d(\mathbb{C})$ is unitarily invariant (briefly u.i.n.) if
$$N(UAV)=N(A)\quad\text{for every}\quad A\in\mathcal{M}_d(\mathbb{C})\quad\text{and}\quad U,V\in\mathcal{U}(d),$$
and $N$ is strictly convex if its restriction to diagonal matrices is a strictly convex norm in $\mathbb{C}^d$. Examples of u.i.n. are the spectral norm $\|\cdot\|$ and the $p$-norms $\|\cdot\|_p$, for $p\geq 1$ (strictly convex if $p>1$).
It is well known that (sub)majorization relations between singular values of matrices are intimately related with inequalities with respect to u.i.n.'s. The following result summarizes these relations (see for example [4]):
Theorem 2.4.
Let $A,B\in\mathcal{M}_d(\mathbb{C})$ be such that $s(A)\prec_w s(B)$. Then:
1. For every u.i.n. $N$ in $\mathcal{M}_d(\mathbb{C})$ we have that $N(A)\leq N(B)$.
2. If $N$ is a strictly convex u.i.n. in $\mathcal{M}_d(\mathbb{C})$ and $N(A)=N(B)$, then $s(A)=s(B)$. □
2.2 Finite frames
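We consider some notions and results from the theory of finite frames. In what follows we adopt:
Notation and terminology: let $\mathcal{F}=\{f_i\}_{i\in\mathbb{I}_k}$ be a finite sequence in $\mathbb{C}^d$. Then,
1. $T_{\mathcal{F}}\in\mathcal{M}_{d,k}(\mathbb{C})$ denotes the synthesis operator of $\mathcal{F}$ given by $T_{\mathcal{F}}\cdot(\alpha_i)_{i\in\mathbb{I}_k}=\sum_{i\in\mathbb{I}_k}\alpha_i f_i$.
2. $T_{\mathcal{F}}^*\in\mathcal{M}_{k,d}(\mathbb{C})$ denotes the analysis operator of $\mathcal{F}$ and it is given by $T_{\mathcal{F}}^*\cdot f=(\langle f,f_i\rangle)_{i\in\mathbb{I}_k}$.
3. $S_{\mathcal{F}}\in\mathcal{M}_d(\mathbb{C})^+$ denotes the frame operator of $\mathcal{F}$ and it is given by $S_{\mathcal{F}}=T_{\mathcal{F}}\,T_{\mathcal{F}}^*$. Hence,
[TABLE]
4. We say that $\mathcal{F}$ is a frame for $\mathbb{C}^d$ if it spans $\mathbb{C}^d$; equivalently, $\mathcal{F}$ is a frame for $\mathbb{C}^d$ if $S_{\mathcal{F}}$ is a positive invertible operator acting on $\mathbb{C}^d$.
△
Hence, in case $\mathcal{F}=\{f_i\}_{i\in\mathbb{I}_k}$ is a frame for $\mathbb{C}^d$ we get the so-called canonical reconstruction formulas: for $x\in\mathbb{C}^d$,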
[TABLE]
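The frame operator $S_{\mathcal{F}}=T_{\mathcal{F}}T_{\mathcal{F}}^*=\sum_i f_i\otimes f_i$ can be computed directly from the vectors. The following is a minimal sketch in plain Python; the function name `frame_operator` and the test family (three unit vectors in the plane at angles $0$, $2\pi/3$, $4\pi/3$) are illustrative choices, not taken from the paper.

```python
import math


def frame_operator(vectors):
    """Return S_F = sum_i f_i f_i^* as a d x d nested list of complex numbers."""
    d = len(vectors[0])
    S = [[0j] * d for _ in range(d)]
    for f in vectors:
        for r in range(d):
            for c in range(d):
                # entry (r, c) of the rank-one matrix f (x) f = f f^*
                S[r][c] += f[r] * f[c].conjugate()
    return S


# Illustrative family: three unit vectors in R^2 at angles 0, 2pi/3, 4pi/3.
three_vectors = [
    (math.cos(2 * math.pi * j / 3), math.sin(2 * math.pi * j / 3))
    for j in range(3)
]
S = frame_operator(three_vectors)
# The trace of S_F equals the sum of the squared norms of the vectors.
trace_S = sum(S[i][i] for i in range(2)).real
```

For this family the frame operator is a multiple of the identity, so the family spans the plane and is a frame in the sense of item 4 above.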
In several applications of finite frame theory, it is important to construct families $\mathcal{F}=\{f_i\}_{i\in\mathbb{I}_k}\in(\mathbb{C}^d)^k$ in such a way that the frame operator $S_{\mathcal{F}}$ and the squared norms $(\|f_i\|^2)_{i\in\mathbb{I}_k}$ are prescribed in advance. This problem is known as the frame design problem, and its solution can be obtained in terms of the Schur-Horn theorem for majorization.
Theorem 2.5 (See [2]).
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and let $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)$. Then, the following statements are equivalent:
1. There exists $\mathcal{F}=\{f_i\}_{i\in\mathbb{I}_k}\in(\mathbb{C}^d)^k$ such that $S_{\mathcal{F}}=S$ and $\|f_i\|^2=a_i$, for $i\in\mathbb{I}_k$;
2. $\mathbf{a}\prec\lambda(S)$. □
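Condition 2 of Theorem 2.5 is easy to test numerically. The sketch below checks $\mathbf{a}\prec\lambda(S)$ for nonnegative vectors of possibly different lengths. It assumes that the shorter vector may be padded with zeros before comparing partial sums, which is harmless for nonnegative entries; the name `is_majorized` and the tolerance are illustrative choices.

```python
def is_majorized(x, y, tol=1e-12):
    """Check x ≺ y for nonnegative vectors, padding the shorter one with zeros."""
    n = max(len(x), len(y))
    # Zero padding after sorting keeps the order only because entries are >= 0.
    xs = sorted(x, reverse=True) + [0.0] * (n - len(x))
    ys = sorted(y, reverse=True) + [0.0] * (n - len(y))
    px = py = 0.0
    for xi, yi in zip(xs, ys):
        px += xi
        py += yi
        if px > py + tol:  # a partial sum of x exceeds that of y
            return False
    return abs(px - py) <= tol  # traces must coincide


# Three unit-norm vectors in C^2 with frame operator (3/2) I: feasible.
feasible = is_majorized([1, 1, 1], [1.5, 1.5])
# Norms (3, 1, 1, 1) against eigenvalues (2, 2, 1, 1): the first sum fails.
infeasible = is_majorized([3, 1, 1, 1], [2, 2, 1, 1])
```

By Theorem 2.5, a `True` answer means that some family with squared norms `x` has a frame operator with eigenvalues `y`; the function does not construct such a family.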
3 Generalized frame operator distance functions
In this section we state our main problem, namely the study of the geometrical and spectral structure of local minimizers of generalized frame operator distance (GFOD) functions. After recalling some preliminary results from [19], we obtain a description of what we call the inner structure of local minimizers of GFOD functions. Since the proofs of some results in this section are quite technical, they are developed in Section 5.
3.1 Statement of the problem and related results
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$. In this case we consider the torus
[TABLE]
By definition, $\mathbb{T}_d(\mathbf{a})$ is the (cartesian) product of spheres in $\mathbb{C}^d$; we endow $\mathbb{T}_d(\mathbf{a})$ with the product metric of the Euclidean metrics in each of these spheres, namely
[TABLE]
Thus, $\mathbb{T}_d(\mathbf{a})$ is a compact smooth manifold. Given a strictly convex u.i.n. $N:\mathcal{M}_d(\mathbb{C})\to\mathbb{R}_{\geq 0}$, we can consider the generalized frame operator distance (G-FOD) in $\mathbb{T}_d(\mathbf{a})$ (see [18]) given by
[TABLE]
where $S_{\mathcal{G}}=\sum_{i\in\mathbb{I}_k}g_i\otimes g_i$ denotes the frame operator of a family $\mathcal{G}\in\mathbb{T}_d(\mathbf{a})$. This notion is based on the frame operator distance (FOD) $\Theta_{(\|\cdot\|_2,S,\mathbf{a})}$ introduced by Strawn in [23], where $\|A\|_2^2=\operatorname{tr}(A^*A)$ defines the Frobenius norm of $A\in\mathcal{M}_d(\mathbb{C})$. Based on his work and on numerical evidence, Strawn conjectured that local minimizers of $\Theta_{(\|\cdot\|_2,S,\mathbf{a})}$ were also global minimizers. In [18] we settled Strawn's conjecture in the affirmative, by relating FOD problems in the norm $\|\cdot\|_2$ with optimal frame completion problems for the Benedetto-Fickus frame potential.
It is then natural to ask whether local minimizers of the G-FOD $\Theta_{(N,S,\mathbf{a})}$ are also global minimizers, where $N$ denotes an arbitrary strictly convex u.i.n. on $\mathcal{M}_d(\mathbb{C})$ (e.g. $p$-norms, with $p\in(1,\infty)$). Unfortunately, the techniques used in [18] do not apply in this general case, leaving untouched the following
Problems 3.1.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$, $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ and fix a strictly convex u.i.n. $N$ on $\mathcal{M}_d(\mathbb{C})$. Then
P1. Compute the spectral and geometrical structure of local minimizers of $\Theta_{(N,S,\mathbf{a})}$ in $\mathbb{T}_d(\mathbf{a})$.
P2. Determine whether local minimizers are global minimizers of $\Theta_{(N,S,\mathbf{a})}$ in $\mathbb{T}_d(\mathbf{a})$.
P3. Determine whether these minimizers depend on the chosen u.i.n.
△
In what follows we completely solve the three problems above in an algorithmic way, thus settling in the affirmative the questions in P2. and P3. (see Theorem 4.12 in Section 4.2).
Next, we recall some results from
[19] that we use throughout our work.
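Theorem 3.2 (See [19]).
Fix $S\in\mathcal{M}_d(\mathbb{C})^+$, $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$, and a strictly convex u.i.n. $N$ on $\mathcal{M}_d(\mathbb{C})$. Consider the map $\Theta_{(N,S,\mathbf{a})}=\Theta:\mathbb{T}_d(\mathbf{a})\to\mathbb{R}_{\geq 0}$ given by $\Theta(\mathcal{G})=N(S-S_{\mathcal{G}})$. Fix a local minimizer $\mathcal{G}_0=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ of $\Theta_{(N,S,\mathbf{a})}$, with frame operator $S_0=S_{\mathcal{G}_0}$. Denote by $W=R(S_0)=\operatorname{span}\{g_i:\ i\in\mathbb{I}_k\}\subseteq\mathbb{C}^d$. Then,
1. There exists $\mathcal{B}=\{v_i\}_{i\in\mathbb{I}_d}$ an ONB of $\mathbb{C}^d$ such that
[TABLE]
In particular, we have that $\lambda(S-S_0)=\big(\lambda(S)-\lambda(S_0)\big)^\downarrow$.
2. The subspace $W$ reduces $S-S_0\in\mathcal{H}(d)$; hence, $D\stackrel{\text{def}}{=}(S-S_0)|_W\in L(W)$ verifies $D^*=D$.
3. All vectors $g_i$ ($i\in\mathbb{I}_k$) are eigenvectors of $D$ and $S-S_0$.
4. Let $\sigma(D)=\{c_1,\ldots,c_p\}$ be such that $c_1<c_2<\ldots<c_p$. Denote by
[TABLE]
Then the subspaces $W_j$ reduce both $S$ and $S_0$, for $j\in\mathbb{I}_p$. Moreover,
[TABLE]
5. If $j\in\mathbb{I}_p$ and $c_j\neq\max\sigma(S-S_0)$ (for example, when $1\leq j<p$), then the family $\{g_\ell\}_{\ell\in J_j}$ is linearly independent.
□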
Remark 3.3.
With the notation of Theorem 3.2, if we assume that
[TABLE]
Indeed, if $W=\mathbb{C}^d$ then $\sigma(S-S_0)=\{c_1,\ldots,c_p\}$. Otherwise $\dim W<d\leq k$ so, by items 4 and 5 of Theorem 3.2, the family $\{g_i\}_{i\in J_p}$ cannot be linearly independent (because the families $\{g_i\}_{i\in J_j}$ are linearly independent for $1\leq j<p$, and all families are mutually orthogonal). By item 5 again, we deduce that $c_p=\max\sigma(S-S_0)$. △
3.2 Inner structure of local minimizers of GFOD's
In this section, based on Theorem 3.2 above, we obtain a detailed description of what we call the inner structure of local minimizers. In order to do this, we introduce the following
Notation 3.4.
Fix $S\in\mathcal{M}_d(\mathbb{C})^+$, $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ and a strictly convex u.i.n. $N$ on $\mathcal{M}_d(\mathbb{C})$. Also consider the notions introduced in Theorem 3.2. As before, consider
1. $\Theta_{(N,S,\mathbf{a})}=\Theta:\mathbb{T}_d(\mathbf{a})\to\mathbb{R}_{\geq 0}$ given by $\Theta(\mathcal{G})=N(S-S_{\mathcal{G}})$.
2. A local minimizer $\mathcal{G}_0=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ of $\Theta_{(N,S,\mathbf{a})}$, with frame operator $S_0=S_{\mathcal{G}_0}$.
3. We denote by $\lambda=(\lambda_i)_{i\in\mathbb{I}_d}=\lambda(S)\in(\mathbb{R}_{\geq 0}^d)^\downarrow$ and $\mu=(\mu_i)_{i\in\mathbb{I}_d}=\lambda(S_0)\in(\mathbb{R}_{\geq 0}^d)^\downarrow$.
4. We fix $\mathcal{B}=\{v_i\}_{i\in\mathbb{I}_d}$ an ONB of $\mathbb{C}^d$ as in Theorem 3.2. Hence,
[TABLE]
5. We consider $W=R(S_0)$, $D=(S-S_0)|_W$ and $\sigma(D)=\{c_1,\ldots,c_p\}$ where $c_1<c_2<\ldots<c_p$.
6. Let $s_D=\max\{i\in\mathbb{I}_d:\ \mu_i\neq 0\}=\operatorname{rk}S_0$.
7. We denote by $\delta=\lambda-\mu\in\mathbb{R}^d$ so that, by Eq. (5),
[TABLE]
Notice that $\delta$ is constructed by pairing the entries of ordered vectors (since $\lambda=\lambda(S)$ and $\mu=\lambda(S_0)$). Nevertheless, we have that $\lambda(S-S_0)=\delta^\downarrow$. In what follows we obtain some properties of (the unordered vector) $\delta$.
8. For each $j\in\mathbb{I}_p$, we consider the following sets of indexes:
[TABLE]
Theorem 3.2 assures that $\mathbb{I}_{s_D}=\bigcup_{j\in\mathbb{I}_p}K_j$ and $\mathbb{I}_k=\bigcup_{j\in\mathbb{I}_p}J_j$ (disjoint unions).
9. By Eq. (2), $R(S_0)=\operatorname{span}\{g_i:\ i\in\mathbb{I}_k\}=W=\bigoplus_{i\in\mathbb{I}_p}\ker(D-c_i\,I_W)$. Then, for every $j\in\mathbb{I}_p$,
[TABLE]
because $g_i\in\ker(D-c_j\,I_W)$ for every $i\in J_j$. Note that, by Theorem 3.2, each $W_j$ reduces both $S$ and $S_0$. △
The next proposition describes the structure of the sets $J_j$ and $K_j$ for $j\in\mathbb{I}_p$, as defined in Notation 3.4. In turn, these sets play a central role in the proof of Theorem 3.8 below.
Proposition 3.5.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ be as in Notation 3.4. Then there exist indexes $0=s_0<s_1<\ldots<s_{p-1}<s_p=\operatorname{rk}S_0\leq d$ such that
[TABLE]
Proof.
See Section 5.
□
Remark 3.6.
Consider Notation 3.4 for $S\in\mathcal{M}_d(\mathbb{C})^+$ and a local minimizer $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ of the map $\Theta_{(N,S,\mathbf{a})}$. Let $s_0=0<s_1<\ldots<s_p\leq d$, where $s_p=\operatorname{rk}(S_0)$, be as in Proposition 3.5. In terms of these indexes we also get that $\lambda(S-S_0)=\delta(S,\mathbf{a},\mathcal{G}_0)^\downarrow$ for $\delta(S,\mathbf{a},\mathcal{G}_0)=\lambda(S)-\lambda(S_0)$, and
[TABLE]
or
[TABLE]
In the next result, we obtain a characterization of the indexes $s_1<\ldots<s_{p-2}$ and constants $c_1<\ldots<c_{p-1}$ in terms of the index $s_{p-1}$ (when $p>1$). In the next section we complement these results, show the key role played by the index $s_{p-1}$ and give a characterization of $c_p$. We begin by fixing some notation, which is independent of the norm $N$ and the local minimizer $\mathcal{G}_0$.
△
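Notation 3.7.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$, $\mathbf{a}\in(\mathbb{R}_{>0}^k)^\downarrow$, $\lambda(S)=(\lambda_i)_{i\in\mathbb{I}_d}\in(\mathbb{R}^d)^\downarrow$ and $m=\min\{k,d\}$.
1. We let $h_i\stackrel{\text{def}}{=}\lambda_i-a_i$, for every $i\in\mathbb{I}_m$.
2. Given $1\leq j\leq r\leq m$, let
[TABLE]
We abbreviate $P_{1,r}=P_r$ for the initial averages. △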
Theorem 3.8.
Consider Notation 3.4 for $S\in\mathcal{M}_d(\mathbb{C})^+$ and a local minimizer $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ of the map $\Theta_{(N,S,\mathbf{a})}$. Assume further that $p>1$. Let $s_0=0<s_1<\ldots<s_p\leq d$ be such that Eq. (6) holds. Then, we have the following relations:
1. The index $s_1=\max\big\{1\leq r\leq s_{p-1}:\ P_r=\min\limits_{1\leq i\leq s_{p-1}}P_i\big\}$, and $c_1=P_{s_1}$.
2. Recursively, if $s_j<s_{p-1}$, then
[TABLE]
Proof.
See Section 5.
□
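The recursion in Theorem 3.8 can be sketched in a few lines of Python. This is a hedged illustration only: it assumes that $P_{j,r}$ is the arithmetic mean of $h_j,\ldots,h_r$ (the text calls these "averages", but the displayed formula is not reproduced here), and that each step picks the largest index attaining the minimal average, with $c_{j+1}=P_{s_j+1,\,s_{j+1}}$ as in Definition 4.1. The function name `block_structure` and the sample data are hypothetical.

```python
from fractions import Fraction


def block_structure(lam, a, r):
    """Return ([s_0, ..., s_q], [c_1, ..., c_q]) with s_q = r.

    Assumes P_{j,t} is the mean of h_j, ..., h_t, where h_i = lam_i - a_i.
    """
    h = [Fraction(l) - Fraction(x) for l, x in zip(lam, a)]
    s, c = [0], []
    while s[-1] < r:
        start = s[-1]
        total = Fraction(0)
        best_avg, best_end = None, None
        for end in range(start + 1, r + 1):
            total += h[end - 1]
            avg = total / (end - start)
            # "<=" keeps the LARGEST index that attains the minimal average
            if best_avg is None or avg <= best_avg:
                best_avg, best_end = avg, end
        s.append(best_end)
        c.append(best_avg)
    return s, c


# Hypothetical data: lam = (5, 4, 3), a = (4, 1, 1), last index r = 2.
indexes, constants = block_structure([5, 4, 3], [4, 1, 1], 2)
```

With these inputs $h=(1,3,2)$, so the first block is $\{1\}$ with constant $1$ and the second is $\{2\}$ with constant $3$; exact rational arithmetic avoids spurious ties or missed ties between averages.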
3.3 The co-feasible case for $k\geq d$.
Throughout this section we assume that $k\geq d$. In [19] we showed that in some cases, local minimizers of G-FOD functions are also global minimizers. We recall this fact in the following
Theorem 3.9 (See [19]).
Consider Notation 3.4 with $k\geq d$ for $S\in\mathcal{M}_d(\mathbb{C})^+$ and a local minimizer $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ of the map $\Theta_{(N,S,\mathbf{a})}$. Assume further that $p=1$, i.e., that there exists $c=c_1$ that satisfies $(S-S_0)\,g_i=c\,g_i$, for every $i\in\mathbb{I}_k$. Then there exists an ONB $\{v_i\}_{i\in\mathbb{I}_d}$ of $\mathbb{C}^d$ such that
[TABLE]
where $(\lambda_i)_{i\in\mathbb{I}_d}=\lambda(S)\in(\mathbb{R}_{\geq 0}^d)^\downarrow$. Moreover, $\mathcal{G}_0$ is a global minimizer of $\Theta$ in $\mathbb{T}_d(\mathbf{a})$.
□
Corollary 3.10.
With the hypotheses and notation in Theorem 3.9 we have that:
1. The constant $c=\max\sigma(S-S_0)$ is the largest eigenvalue of $S-S_0$.
2. The eigenvalue $\lambda_i(S_0)=(\lambda_i-c)^+$, for every $i\in\mathbb{I}_d$.
3. The list of norms $\mathbf{a}\prec\big((\lambda_i-c)^+\big)_{i\in\mathbb{I}_d}$. In particular
[TABLE]
Proof.
We are assuming that $k\geq d$. Then Remark 3.3 assures that $c=c_p=\max\sigma(S-S_0)$.
- This is a direct consequence of Theorem 3.9 above and the fact that $(\lambda_i)_{i\in\mathbb{I}_d}=\lambda(S)\in(\mathbb{R}^d)^\downarrow$, so that also $\big((\lambda_i-c)^+\big)_{i\in\mathbb{I}_d}\in(\mathbb{R}^d)^\downarrow$.
- Since $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ (it is a family of vectors with norms given by $\mathbf{a}$), then Theorem 2.5 assures that
[TABLE]
The rest of the statement is a direct consequence of this majorization relation.
□
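The previous results motivate the following notion, which only depends on some $\lambda\in(\mathbb{R}_{\geq 0}^d)^\downarrow$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$, with $k\geq d$ (and does not require any norm $N$ nor a local minimizer $\mathcal{G}_0$).
Definition 3.11.
Let $\lambda\in(\mathbb{R}_{\geq 0}^d)^\downarrow$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$, with $k\geq d$. We say that the pair $(\lambda,\mathbf{a})$ is co-feasible if there exists a constant
$$c<\lambda_1\quad\text{such that}\quad\mathbf{a}\prec\big((\lambda_i-c)^+\big)_{i\in\mathbb{I}_d}.\qquad(10)$$
In this case, the co-feasibility constant $c$ is uniquely determined by $\operatorname{tr}(\mathbf{a})=\sum_{i\in\mathbb{I}_d}(\lambda_i-c)^+$.
△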
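The trace equation that determines the co-feasibility constant can be solved in closed form once the number of active terms is known. The following short derivation is a sketch; the symbol $n$ (the number of eigenvalues strictly above $c$) is introduced here only for illustration.

```latex
% The map c -> sum_i (lambda_i - c)^+ is continuous and strictly decreasing
% on (-infty, lambda_1], it vanishes at c = lambda_1 and is unbounded as
% c -> -infty, so tr(a) > 0 is attained at exactly one c < lambda_1.
\[
  n \;=\; \#\{\, i\in\mathbb{I}_d \,:\, \lambda_i > c \,\}
  \quad\Longrightarrow\quad
  \operatorname{tr}(\mathbf{a})
  \;=\; \sum_{i=1}^{n} (\lambda_i - c)
  \;=\; \sum_{i=1}^{n}\lambda_i \;-\; n\,c ,
\]
\[
  \text{so that}\qquad
  c \;=\; \frac{1}{n}\Big(\sum_{i=1}^{n}\lambda_i - \operatorname{tr}(\mathbf{a})\Big),
  \qquad\text{with}\qquad
  \lambda_{n+1} \;\le\; c \;<\; \lambda_n
  \quad(\text{the lower bound only if } n<d).
\]
```

In practice one can try $n=1,\ldots,d$ in turn and keep the unique value of $c$ that satisfies the last pair of inequalities; the majorization in the definition must then be checked separately.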
Proposition 3.12.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ with $k\geq d$. Then the pair $(\lambda(S),\mathbf{a})$ is co-feasible if and only if the following conditions hold:
1. There exist $\mathcal{G}=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ and $c\in\mathbb{R}$ such that $(S-S_{\mathcal{G}})\,g_i=c\,g_i$, for every $i\in\mathbb{I}_k$.
2. This constant $c=\max\sigma(S-S_{\mathcal{G}})$. △
Proof.
Assume that there exist $c\in\mathbb{R}$ and $\mathcal{G}=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ which satisfy items 1 and 2. By Eq. (2), $W=R(S_{\mathcal{G}})=\operatorname{span}\{g_i:\ i\in\mathbb{I}_k\}$. Since $(S-S_{\mathcal{G}})\big|_W=c\,I_W$, then $S(W)\subseteq W$. Let $r=\dim W$. Then, considering separately the eigenvalues of $S\big|_W$ and $S\big|_{W^\perp}=(S-S_{\mathcal{G}})\big|_{W^\perp}$, the fact that $c=\max\sigma(S-S_{\mathcal{G}})$ implies that
[TABLE]
Therefore $\lambda(S_{\mathcal{G}})=\lambda\big(S-(S-S_{\mathcal{G}})\big)=\big((\lambda_i(S)-c)^+\big)_{i\in\mathbb{I}_d}$. Hence, arguing as in the proof of Corollary 3.10, we conclude that this $c$ satisfies Eq. (10). Note that $c<\lambda_1(S)$ because $\operatorname{tr}\mathbf{a}\neq 0$.
Conversely, if there exists $c$ which satisfies Eq. (10), let $\mathcal{B}=\{v_i\}_{i\in\mathbb{I}_d}$ be an ONB for $\mathbb{C}^d$ such that
[TABLE]
By Theorem 2.5, there exists $\mathcal{G}=\{g_j\}_{j\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ such that $S_0=S_{\mathcal{G}}$. Note that
[TABLE]
Then $c\geq\max\sigma(S-S_{\mathcal{G}})$. If we let
[TABLE]
then $\{0\}\neq W\stackrel{\text{def}}{=}R(S_{\mathcal{G}})=\operatorname{span}\{v_i:\ i\in\mathbb{I}_r\}$, and it satisfies that $(S-S_{\mathcal{G}})\big|_W=c\,I_W$. The proof finishes by noticing that, by Eq. (2), $g_j\in W$ and hence $(S-S_{\mathcal{G}})\,g_j=c\,g_j$ for every $j\in\mathbb{I}_k$.
□
Remark 3.13.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ (with $k\geq d$) be such that the pair $(\lambda(S),\mathbf{a})$ is co-feasible. Let $\mathcal{G}=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ and $c\in\mathbb{R}$ be as in the proof of the second part of Proposition 3.12. Then, by Theorem 3.9, $\mathcal{G}$ is a global (and local) minimizer of the map $\Theta_{(N,S,\mathbf{a})}$, with $p=1$. Nevertheless, a priori this fact does not imply that every local minimizer should have the same structure (namely, that it also has $p=1$). We shall prove soon that the spectral structure of local minimizers is indeed unique (in general, and then also in the co-feasible cases).
△
It is worth pointing out that there are GFOD problems that are not co-feasible. In order to see this we include the following:
Example 3.14.
Let $S\in\mathcal{M}_4(\mathbb{C})^+$ be such that $\lambda:=\lambda(S)=(2,2,1,1)\in(\mathbb{R}_{>0}^4)^\downarrow$ and let $\mathbf{a}=(3,1,1,1)\in(\mathbb{R}_{>0}^4)^\downarrow$. Then, the pair $(\lambda,\mathbf{a})$ is not co-feasible. Indeed, the unique solution $c<2$ to the equation $6=\operatorname{tr}(\mathbf{a})=2\,(2-c)^++2\,(1-c)^+$ is $c=0$. Thus $\big((\lambda_i-c)^+\big)_{i\in\mathbb{I}_4}=\lambda$. But it can be easily checked that $\mathbf{a}\not\prec\lambda$. △
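The two claims in the example can be verified by a short computation, spelled out here as a sketch of the arithmetic.

```latex
% Solving the trace equation on each linear piece:
\[
  c\le 1:\quad 2(2-c)+2(1-c) \;=\; 6-4c \;=\; 6 \iff c=0 ,
  \qquad
  1\le c<2:\quad 2(2-c) \;=\; 4-2c \;\le\; 2 \;<\; 6 .
\]
% Hence c = 0 is the only solution below lambda_1 = 2.
% Majorization already fails at the first partial sum:
\[
  a_1 \;=\; 3 \;>\; 2 \;=\; \lambda_1
  \quad\Longrightarrow\quad
  \mathbf{a}\not\prec\lambda ,
  \qquad\text{even though}\qquad
  \operatorname{tr}(\mathbf{a}) \;=\; 6 \;=\; \operatorname{tr}(\lambda) .
\]
```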
Although in general, given $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$, the pair $(\lambda(S),\mathbf{a})$ corresponding to this data is not co-feasible, the GFOD problems contain a co-feasible part. Indeed, if we further consider a strictly convex u.i.n. $N$ in $\mathcal{M}_d(\mathbb{C})$, then local minimizers of $\Theta_{(N,S,\mathbf{a})}$ allow us to locate such co-feasible parts. In order to describe this situation, we introduce the following
Definition 3.15.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ with $k\geq d$. For $r\in\mathbb{I}_{d-1}\cup\{0\}$ we consider the truncated data
[TABLE]
We say that $r$ is a co-feasible index for $S$ and $\mathbf{a}$ if the pair $(\lambda^{(r)}(S),\mathbf{a}^{(r)})$ is co-feasible (according to Definition 3.11 with dimensions $d-r\leq k-r$).
△
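Co-feasible indexes can be enumerated numerically. The sketch below rests on two assumptions inferred from the surrounding text rather than from the displayed formulas: the truncated data are $\lambda^{(r)}(S)=(\lambda_{r+1},\ldots,\lambda_d)$ and $\mathbf{a}^{(r)}=(a_{r+1},\ldots,a_k)$ (as in Remark 3.16), and co-feasibility means $\mathbf{a}\prec((\lambda_i-c)^+)_i$ for the constant $c<\lambda_1$ fixed by the trace equation. All function names are hypothetical; exact rational arithmetic is used to avoid rounding issues in the majorization test.

```python
from fractions import Fraction


def cofeasibility_constant(lam, t):
    """Solve t = sum_i (lam_i - c)^+ for c < lam_1 (lam non-increasing, t > 0)."""
    prefix = Fraction(0)
    for n, lam_n in enumerate(lam, start=1):
        prefix += lam_n
        c = (prefix - t) / n  # candidate with exactly n active terms
        nxt = lam[n] if n < len(lam) else None
        if c < lam_n and (nxt is None or nxt <= c):
            return c
    raise ValueError("no solution: need t > 0 and lam sorted non-increasingly")


def is_majorized(x, y):
    """Check x ≺ y for nonnegative vectors, padding the shorter with zeros."""
    n = max(len(x), len(y))
    xs = sorted(x, reverse=True) + [Fraction(0)] * (n - len(x))
    ys = sorted(y, reverse=True) + [Fraction(0)] * (n - len(y))
    px = py = Fraction(0)
    for xi, yi in zip(xs, ys):
        px, py = px + xi, py + yi
        if px > py:
            return False
    return px == py


def cofeasible_indices(lam, a):
    """Return the pairs (r, c(r)) for the co-feasible indexes r in {0, ..., d-1}."""
    lam = [Fraction(v) for v in lam]
    a = [Fraction(v) for v in a]
    found = []
    for r in range(len(lam)):
        lam_r, a_r = lam[r:], a[r:]  # assumed truncated data
        c = cofeasibility_constant(lam_r, sum(a_r))
        target = [max(v - c, Fraction(0)) for v in lam_r]
        if is_majorized(a_r, target):
            found.append((r, c))
    return found


# Data of Example 3.14: r = 0 is not co-feasible, but r = 1, 2, 3 are.
example = cofeasible_indices([2, 2, 1, 1], [3, 1, 1, 1])
```

For the data of Example 3.14 the index $r=0$ is rejected, in agreement with the example, while $r=1$ is accepted with constant $1/3$; it is therefore the minimal co-feasible index for these data.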
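Remark 3.16.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ with $k\geq d$. Let $\mathcal{B}=\{v_i\}_{i\in\mathbb{I}_d}$ be an ONB for $\mathbb{C}^d$ such that $S\,v_i=\lambda_i(S)\,v_i$ for $i\in\mathbb{I}_d$. Then, by Proposition 3.12, an index $r\in\mathbb{I}_{d-1}\cup\{0\}$ is co-feasible if and only if the conditions 1 and 2 of Proposition 3.12 hold for the space $V_r=\operatorname{span}\{v_i:\ r+1\leq i\leq d\}$, the positive operator $S_r=S|_{V_r}\in L(V_r)$ and the vector of norms $\mathbf{a}^{(r)}=(a_{r+1},\ldots,a_k)\in(\mathbb{R}_{>0}^{k-r})^\downarrow$. This means that there exist $c\in\mathbb{R}$ and
[TABLE]
such that $(S_r-S_{\mathcal{G}})\,g_i=c\,g_i$, for every $i\in\mathbb{I}_{k-r}$, and $c=\max\sigma(S_r-S_{\mathcal{G}})$. Note that this statement seems to depend on the basis $\mathcal{B}$. But actually, the list of eigenvalues $\lambda(S_r)=\lambda^{(r)}(S)\in(\mathbb{R}_{\geq 0}^{d-r})^\downarrow$, so it does not depend on $\mathcal{B}$.
△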
The next result complements Theorem 3.8.
Proposition 3.17.
Consider Notation 3.4 with $k\geq d$ for $S\in\mathcal{M}_d(\mathbb{C})^+$ and a local minimizer $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ of the map $\Theta_{(N,S,\mathbf{a})}$. Let $0=s_0<s_1<\ldots<s_{p-1}<s_p\leq d$ be as in Proposition 3.5. Then $c_p=\max\sigma(S-S_{\mathcal{G}_0})$ and $s_{p-1}$ is a co-feasible index for $S$ and $\mathbf{a}$. In particular, the constant $c_p$ and the index $s_p=\operatorname{rk}S_{\mathcal{G}_0}$ are uniquely determined by the equations
[TABLE]
Proof.
Let $S_0=S_{\mathcal{G}_0}$. Note that $c_p=\max\sigma(S-S_0)$ by Remark 3.3, since we are assuming that $k\geq d$. In order to show that $s_{p-1}$ is a co-feasible index we shall use Remark 3.16. Let $r=s_{p-1}$. Recall from Notation 3.4 and Proposition 3.5 that $J_p=\{i\in\mathbb{I}_k:\ (S-S_0)\,g_i=c_p\,g_i\}=\{r+1,\ldots,k\}$ and that $W_p=\operatorname{span}\{g_i:\ i\in J_p\}=\operatorname{span}\{v_j:\ r+1\leq j\leq s_p\}$. Since
[TABLE]
Then, $\mathcal{G}_r=\{g_i\}_{i=r+1}^k\in\mathbb{T}_{V_r}(\mathbf{a}^{(r)})=\mathbb{T}_{k-r}(\mathbf{a}^{(r)})\cap V_r^{\,k-r}$ is such that $S_{\mathcal{G}_r}=S_0|_{V_r}$ (here we use that, by Eq. (3), $g_j\in W_p^\perp$ for every $j\notin J_p$). So that, if $P_M$ denotes the orthogonal projection onto a subspace $M\subseteq\mathbb{C}^n$,
[TABLE]
Hence $\max\sigma(S|_{V_r}-S_{\mathcal{G}'})\leq\max\sigma(S-S_0)=c_p$ and, by Remark 3.16, $s_{p-1}=r$ is a co-feasible index for $S$ and $\mathbf{a}$. Then, by Definition 3.15, $s_p$ and $c_p$ are determined by Eq. (12).
□
Remark 3.18.
Consider Notation 3.4 with $k\geq d$ for $S\in\mathcal{M}_d(\mathbb{C})^+$ and a local minimizer $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ of the map $\Theta_{(N,S,\mathbf{a})}$. Taking into account all objects and facts detailed in Notation 3.4, Remark 3.6, Theorem 3.8, Eq. (11) and Proposition 3.17, we conclude that $\lambda(S-S_{\mathcal{G}_0})=\delta(S,\mathbf{a},\mathcal{G}_0)^\downarrow$, with
[TABLE]
or $\delta(S,\mathbf{a},\mathcal{G}_0)=\big(\min\{\lambda_i(S),c_1\}\big)_{i\in\mathbb{I}_d}$ (if $p=1$, the co-feasible case), where all data in this formula can be explicitly computed in terms of $S$, $\mathbf{a}$ and the index $s_{p-1}$. Indeed, this expression depends on $\mathcal{G}_0$ and $N$ only through the index $s_{p-1}$, which determines the previous indexes and constants by Theorem 3.8, and the co-feasible part which begins at $s_{p-1}$, so it determines $s_p$ and $c_p$, by Proposition 3.17 via Eq. (12). Hence we shall denote $s_{p-1}=s_{p-1}(\mathcal{G}_0)$.
△
We end this section with the following result, which compares the co-feasibility constants corresponding to different co-feasible indexes.
Corollary 3.19.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ with $k\geq d$ and assume that $r,s\in\mathbb{I}_{d-1}$ are co-feasible indexes for $S$ and $\mathbf{a}$. Denote by $c(s)$ and $c(r)$ their co-feasibility constants. Then,
[TABLE]
Proof.
By Proposition 3.12, $\mathbf{a}^{(s)}\prec\big((\lambda_i(S)-c(s))^+\big)_{i=s+1}^d$ and $\mathbf{a}^{(r)}\prec\big((\lambda_i(S)-c(r))^+\big)_{i=r+1}^d$. Then
[TABLE]
Therefore,
[TABLE]
But if $c(s)<c(r)$ then $(\lambda_i(S)-c(s))^+\geq(\lambda_i(S)-c(r))^+$ for every $i\in\mathbb{I}_d$, and moreover, we have that $(\lambda_{r+1}(S)-c(s))^+>(\lambda_{r+1}(S)-c(r))^+$ because $\sum_{i=r+1}^k a_i>0$ implies, by the relations above, that $c(r)<\lambda_{r+1}(S)$.
□
4 Main results
In this section we state and prove our main result, namely that local minimizers of GFOD's are actually global minimizers. This is achieved by considering in detail the results obtained in Section 3 related with the spectral structure of local minimizers of GFOD functions, and the notion of co-feasible index. We first consider the case when $k\geq d$.
4.1 When $k\geq d$
Throughout this subsection we assume that $k\geq d$. Notice that Eqs. (7) and (8) together with Theorem 3.8 and Proposition 3.17 give a detailed description of the spectral structure of local minimizers of GFOD problems. With the notation of these results, it is worth pointing out the key role played by the (co-feasible) index $s_{p-1}$ in the determination of the complete spectral structure of $S-S_0$ and $S_0$ (see Definition 3.11). The basic idea for what follows is to replace $s_{p-1}$ by an arbitrary co-feasible index $r$, to reproduce the algorithm given in Theorem 3.8 and get indexes and constants in terms of $r$ (which a priori are not associated to any minimizer $\mathcal{G}_0$). Then, we shall show that there exists a unique "correct" index $r$ (i.e. co-feasible and admissible, see Definition 4.1 below) which only depends on $\lambda(S)$ and $\mathbf{a}$, so that it must coincide with $s_{p-1}(\mathcal{G}_0)$.
Definition 4.1.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$. For a co-feasible index $r\in\mathbb{I}_{d-1}\cup\{0\}$ let $q=q(r)\in\mathbb{I}_d$, $0=s_0(r)<s_1(r)<\ldots<s_{q-1}(r)=r<s_q\leq d\leq k$ and $c_1(r),\ldots,c_q(r)$ be computed according to the following recursive algorithm (which only depends on $r$, $\lambda(S)$ and $\mathbf{a}$):
1. If $r=0$, set $q=q(r)=1$ and $s_0(r)=s_{q-1}(r)=r=0$ (and go to item 4.).
2. If $r>0$, using the numbers $P_{i,j}$ defined in Notation 3.7, the index
[TABLE]
3. If the index $s_j(r)$ is already computed and $s_j(r)<r$, then
[TABLE]
and $c_{j+1}(r)=P_{s_j(r)+1,\,s_{j+1}(r)}$.
4. If $s_j(r)=r$, we set $q=q(r)=j+1$ (so that $s_{q-1}(r)=r$), and we define $c_q(r)$ and $s_q(r)$ (with $c_q(r)<\lambda_{r+1}$ and $r=s_{q-1}(r)<s_q(r)\leq d$) that are uniquely determined by
[TABLE]
[TABLE]
In particular, $s_q(r)=\max\{i\in\mathbb{I}_d:\ \lambda_i(S)-c_q(r)>0\}$ since $\lambda(S)=\lambda(S)^\downarrow$.
5. If $r>0$ we denote by $\delta(\lambda(S),\mathbf{a},r)\in\mathbb{R}^d$ the vector given by
[TABLE]
and $\delta(\lambda(S),\mathbf{a},0)=\big(\min\{\lambda_i(S),c_1(0)\}\big)_{i\in\mathbb{I}_d}$.
It is easy to see (by construction) that
[TABLE]
Finally, we shall say that the index $r$ is admissible if $r=0$, or $r>0$ and $c_{q-1}(r)<c_q(r)$.
△
Remark 4.2.
Consider a fixed strictly convex u.i.n. $N$ in $\mathcal{M}_d(\mathbb{C})$. Let $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ be a local (or global) minimizer of $\Theta_{(N,S,\mathbf{a})}=\Theta:\mathbb{T}_d(\mathbf{a})\to\mathbb{R}_{\geq 0}$. Assume that $k\geq d$. We can apply the previous results to $\mathcal{G}_0$; thus, we consider $p\geq 1$ and constants $c_1<\ldots<c_p$ and indexes $s_0=0<s_1<\ldots<s_p\leq d$ as in Theorem 3.8 and Proposition 3.17. In particular, we get that $s_{p-1}$ is a co-feasible index which is also admissible since, if $s_{p-1}>0$, then $c_{p-1}<c_p$ by definition (see Theorem 3.2). The idea of what follows is to show that $s_{p-1}$ (denoted $s_{p-1}(\mathcal{G}_0)$ in Remark 3.18) is the unique index which has both properties (for any norm $N$). First, we need to verify some properties of the vector $\delta(\lambda(S),\mathbf{a},r)$ for a co-feasible and admissible index.
△
Proposition 4.3.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ (with $k\geq d$). Let $r\in\mathbb{I}_{d-1}\cup\{0\}$ be a co-feasible index. Then, with $p=q(r)$, $s_j=s_j(r)$, $c_j=c_j(r)$ for $j\in\mathbb{I}_p$, and $\delta=\delta(\lambda(S),\mathbf{a},r)$ as in Definition 4.1, we have that:
1. If $p>1$ then $c_1<\ldots<c_{p-1}$. If we also assume that $r$ is admissible, then $c_{p-1}<c_p=\max\limits_{i\in\mathbb{I}_d}\delta_i$ and:
2. $\lambda_{s_{p-1}+1}\geq\lambda_{s_p}>c_p$ and $\lambda_i(S)>c_j$, for every $s_{j-1}+1\leq i\leq s_j$ and $j\in\mathbb{I}_{p-1}$. Then
[TABLE]
3. If $p>1$ then $(a_i)_{i=s_{j-1}+1}^{s_j}\prec\big(\lambda_i(S)-c_j\big)_{i=s_{j-1}+1}^{s_j}\in\mathbb{R}_{>0}^{s_j-s_{j-1}}$, for every $j\in\mathbb{I}_{p-1}$.
4. $(a_i)_{i=s_{p-1}+1}^{k}\prec\big((\lambda_i(S)-c_p)^+\big)_{i=s_{p-1}+1}^{d}\in\mathbb{R}_{\geq 0}^{d-s_{p-1}}$.
Proof.
- The case $p=2$ is trivial. If $p>2$, assume that there exists $j\in\mathbb{I}_{p-2}$ such that $c_j\geq c_{j+1}$. Then, notice that
[TABLE]
which contradicts the definition of $s_j$ in Eq. (15), since $s_{j+1}\leq s_{p-1}=r$. Thus, $c_1<\ldots<c_{p-1}$. If $r=s_{p-1}$ is an admissible index, then $c_{p-1}<c_p=\max\limits_{i\in\mathbb{I}_d}\delta_i$ by definition and Eq. (18). By Eq. (17), we have that $c_p<\lambda_i(S)$ for $s_{p-1}+1\leq i\leq s_p$. Therefore, if
[TABLE]
since $i\leq s_j\leq s_{p-1}<s_{p-1}+1$ and $\lambda(S)\in(\mathbb{R}^d)^\downarrow$.
- For $j\in\mathbb{I}_{p-1}$ and $s_{j-1}+1\leq m\leq s_j$, we have that
[TABLE]
(the equivalence also holds for equalities). Using the definition of $c_j$ (item 2. of Definition 4.1), we see that the inequalities to the right in Eq. (21) hold for every such index $m$, with equality for $m=s_j$ (by definition of $c_j$ and $s_j$). We have proved that $(a_i)_{i=s_{j-1}+1}^{s_j}\prec\big(\lambda_i(S)-c_j\big)_{i=s_{j-1}+1}^{s_j}$.
Item 4 follows immediately from the fact that $r=s_{p-1}$ is a co-feasible index (see Definition 3.15).
□
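Corollary 4.4.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ (with $k\geq d$).
Let $r\in\mathbb{I}_{d-1}\cup\{0\}$ be a co-feasible index which is also admissible. Then
$\mathbf{a}\prec\lambda(S)-\delta(\lambda(S),\mathbf{a},r)$.
Proof.
The relation $\mathbf{a}\prec\lambda(S)-\delta(\lambda(S),\mathbf{a},r)$ follows from
items 3 and 4 of Proposition 4.3, since $x\prec y$ and
$z\prec w\implies(x,z)\prec(y,w)$ (Remark 2.2).
□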
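The block-wise argument in the proof of Corollary 4.4 can be checked numerically. The following is a minimal sketch, not part of the paper: the helper name `is_majorized_by` and the sample vectors are hypothetical, and the zero-padding of the shorter vector is an assumption reflecting the usual convention for comparing vectors of different sizes.

```python
from itertools import accumulate


def is_majorized_by(x, y, tol=1e-12):
    """Return True if x is majorized by y (x ≺ y).

    Assumption: vectors of different sizes are compared after padding
    the shorter one with zeros.
    """
    n = max(len(x), len(y))
    if n == 0:
        return True
    # Pad first, then rearrange both vectors in decreasing order.
    xs = sorted(list(x) + [0.0] * (n - len(x)), reverse=True)
    ys = sorted(list(y) + [0.0] * (n - len(y)), reverse=True)
    px = list(accumulate(xs))
    py = list(accumulate(ys))
    # Partial sums of x never exceed those of y, and the totals agree.
    partial_ok = all(u <= v + tol for u, v in zip(px, py))
    return partial_ok and abs(px[-1] - py[-1]) <= tol


# Hypothetical blocks: x ≺ y and z ≺ w.
x, y = [2, 2], [3, 1]
z, w = [1, 1, 1], [2, 1, 0]
print(is_majorized_by(x, y), is_majorized_by(z, w))
# Concatenation preserves majorization, as used in the proof.
print(is_majorized_by(x + z, y + w))
```

The check sorts the concatenated vectors again, so the order in which the blocks are listed is irrelevant.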
Theorem 4.5.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ (with $k\geq d$).
Then there is a unique co-feasible
and admissible index $s\in\mathbb{I}_{d-1}\cup\{0\}$, and this $s$ is the minimal co-feasible index.
Proof.
Assume that there exist two co-feasible indexes $0\leq s<r\leq d-1$ such that
$r$ is admissible. We show that this leads to a contradiction.
Indeed, let $s_0=0<s_1<\ldots<s_{p-1}=r<s_p\leq d$ and $c_1<\ldots<c_p$ be the indexes and constants corresponding to
Definition 4.1, for the index $r$
(i.e., we rename $p=q(r)$, $s_j=s_j(r)$ and $c_j=c_j(r)$ for $j\in\mathbb{I}_p$). Let $\lambda\stackrel{\text{def}}{=}\lambda(S)\in(\mathbb{R}_{\geq 0}^d)^\downarrow$ and consider
[TABLE]
Similarly, consider $q=q(s)$ and let
$s_0^*=0<s_1^*<\ldots<s_{q-1}^*=s<s_q^*\leq d$, $c_1^*<\ldots<c_{q-1}^*$ and $c_q^*$ be the indexes and constants corresponding to
Definition 4.1, for the index $s$. We also consider
[TABLE]
If $\delta^*=\delta$ then by Eqs. (17), (22)
and (23),
$s_q^*=s_p=\max\{i\in\mathbb{I}_d:\delta_i<\lambda_i(S)\}$, and
$c_q^*=c_p=\delta_{s_p}$. But in this case
$r=s_{p-1}=\min\{i\in\mathbb{I}_{d-1}:\delta_{i+1}=c_p\}\leq s_{q-1}^*=s$, a contradiction. Hence
$\delta^*\neq\delta$.
Case 1.
Assume that there exists $1\leq j\leq\min\{p-1,q-1\}$ such that
[TABLE]
Next we show that this leads to a contradiction ($\delta^*=\delta$). Indeed,
since $s_{j-1}=s_{j-1}^*$, by construction
[TABLE]
and
[TABLE]
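Since $s<r$, the minimum over the smaller range dominates the minimum over the larger one, that is, $\min_{s_{j-1}+1\leq\ell\leq s}P_{s_{j-1}+1,\ell}\geq\min_{s_{j-1}+1\leq\ell\leq r}P_{s_{j-1}+1,\ell}$.
Since $s_j^*\neq s_j$, this fact easily shows
that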
[TABLE]
On the other hand, by Corollary 3.19 we have that
$c_q^*=c_q(s)\geq c_p(r)=c_p$, since they are the co-feasible constants corresponding to the co-feasible indexes $s_{q-1}^*=s<r=s_{p-1}$.
With these facts we can compare $\delta$ and $\delta^*$.
We have that
$\delta_i=\delta_i^*$ for $1\leq i\leq s_{j-1}=s_{j-1}^*$
by hypothesis.
By Eqs. (22), (23),
and item 1 of Proposition 4.3 ($c_j^*<\cdots<c_{q-1}^*$),
[TABLE]
Since $c_p=\max\{\delta_j:j\in\mathbb{I}_d\}$ by
Proposition 4.3 ($r$ is admissible), then
[TABLE]
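Finally, $\delta_i^*=\lambda_i\geq\delta_i$, for $s_q^*<i\leq d$
(item 2 in Proposition 4.3).
Therefore $\delta\leqslant\delta^*$. Since
$\operatorname{tr}(\delta)=\operatorname{tr}(S)-\operatorname{tr}(\mathbf{a})=\operatorname{tr}(\delta^*)$ by Eq. (19),
we get that $\delta=\delta^*$, a contradiction.
Case 2. If we assume that $p\leq q$ and $s_j=s_j^*$ (and hence $c_j=c_j^*$) for $0\leq j\leq p-1$,
then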
[TABLE]
Case 3. Finally, if
$q<p$ and $s_j=s_j^*$ (and hence $c_j=c_j^*$) for $0\leq j\leq q-1$,
then we have that $\delta_i=\delta_i^*$ for $1\leq i\leq s_{q-1}^*=s_{q-1}$.
Then, by Proposition 4.3, we have that
[TABLE]
Hence, $\delta\leq\delta^*$. Using that $\operatorname{tr}(\delta)=\operatorname{tr}(\delta^*)$, also in this case we conclude that $\delta=\delta^*$, a contradiction.
The proof finishes once we notice that one of these three cases must occur.
∎
Definition 4.6.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ with $\lambda=\lambda(S)\in(\mathbb{R}_{\geq 0}^d)^\downarrow$ and $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ (with $k\geq d$).
If $s\in\mathbb{I}_{d-1}\cup\{0\}$ is the unique
co-feasible and admissible index for $S$ and $\mathbf{a}$ (which exists by Remark 4.2), then we denote by
$\delta(\lambda,\mathbf{a})\stackrel{\text{def}}{=}\delta(\lambda,\mathbf{a},s)$ the vector given in Eq. (18) of
Definition 4.1. △
Remark 4.7.
With the notation of Definition 4.6 above, notice that the vector $\delta(\lambda,\mathbf{a})$ can be computed using
a fast algorithm. Indeed, the notions of co-feasible and admissible index are algorithmic and can be checked using
a fast routine; once the unique co-feasible and admissible index is found, the vector $\delta(\lambda,\mathbf{a})$ is obtained
directly from the construction in Definition 4.1. △
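Theorem 4.8.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ with $\lambda=\lambda(S)\in(\mathbb{R}_{\geq 0}^d)^\downarrow$, $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ (with $k\geq d$) and
$\delta(\lambda,\mathbf{a})$ as in Definition 4.6. If $N$ is a strictly convex u.i.n. in $\mathcal{M}_d(\mathbb{C})$ and
$\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$,
then the following statements are equivalent:
1. $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ is a global minimizer of $\Theta_{(N,S,\mathbf{a})}$;
2. $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ is a local minimizer of $\Theta_{(N,S,\mathbf{a})}$;
3. $\lambda(S-S_{\mathcal{G}_0})=\delta(\lambda,\mathbf{a})^\downarrow$.
Hence, the global (and local) minimizers are the same for every strictly convex u.i.n. $N$.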
Proof.
Clearly, 1. $\Rightarrow$ 2. In order to see 2. $\Rightarrow$ 3., we recall Remarks 3.6, 3.18
and 4.2, where we have seen that $\lambda(S-S_{\mathcal{G}_0})=\delta(\lambda,\mathbf{a},\mathcal{G}_0)^\downarrow$,
for the vector
$\delta(\lambda,\mathbf{a},\mathcal{G}_0)$ given in Eq. (13) and
completely determined by the index called $s_{p-1}(\mathcal{G}_0)$. By Remark 4.2 and
Theorem 4.5, this $s_{p-1}(\mathcal{G}_0)$ is
the unique co-feasible and admissible index of Theorem 4.5. Therefore, by Equations (13) and (18),
[TABLE]
3. $\Rightarrow$ 1.
Notice that $\Theta$ is a continuous function defined on a compact metric space, so there exists
$\mathcal{G}_1\in\mathbb{T}_d(\mathbf{a})$ that is a global minimizer of $\Theta$ and, in particular, a local minimizer.
By the already proved 2. $\Rightarrow$ 3., we must have that $\lambda(S-S_{\mathcal{G}_1})=\delta(\lambda,\mathbf{a})^\downarrow=\lambda(S-S_{\mathcal{G}_0})$.
In particular, since $N$ is unitarily invariant,
[TABLE]
where $D_{\delta(\lambda,\mathbf{a})}\in\mathcal{M}_d(\mathbb{C})$ denotes the diagonal matrix with main diagonal $\delta(\lambda,\mathbf{a})$.
∎
We end this section with the following examples.
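Example 4.9.
Consider $\mathcal{B}=\{e_1,e_2\}$ the canonical basis of $\mathbb{C}^2$. Let
$S=3\,e_1\otimes e_1+e_2\otimes e_2\in\mathcal{M}_2(\mathbb{C})^+$ and $\mathbf{a}=(1,1)$
(i.e. $k=d=2$).
Then $S$ is an invertible operator. Consider the vectors $g_1=g_2=e_1$,
and $\mathcal{G}_0=\{g_1,g_2\}\in\mathbb{T}_2(\mathbf{a})$. Then $\lambda(S-S_{\mathcal{G}_0})=\lambda(e_1\otimes e_1+e_2\otimes e_2)=(1,1)$.
If $\mathcal{G}\in\mathbb{T}_2(\mathbf{a})$ is arbitrary, then
$\operatorname{tr}\lambda(S-S_{\mathcal{G}})=\operatorname{tr}S-\operatorname{tr}S_{\mathcal{G}}=2$. Hence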
[TABLE]
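by Remark 2.2 and Theorem 2.4.
Then $\Theta_{(N,S,\mathbf{a})}(\mathcal{G}_0)\leq\Theta_{(N,S,\mathbf{a})}(\mathcal{G})$, for every u.i.n. $N$.
Thus, $\mathcal{G}_0=\{e_1,e_1\}$ is a global minimizer of $\Theta_{(N,S,\mathbf{a})}$ in $\mathbb{T}_2(\mathbf{a})$.
Therefore this problem is co-feasible, so that $p=1$, $s_1=\operatorname{rk}S_{\mathcal{G}_0}=1$ and $c_1=\lambda_2(S)=1$.
Notice that in this case $\mathcal{G}_0$ is not a frame for $\mathbb{C}^2$ (even though $S\in\mathcal{M}_2(\mathbb{C})^+$ is invertible and $k\geq d$). △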
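Example 4.10.
Consider $\mathcal{B}=\{e_1,e_2\}$ the canonical basis of $\mathbb{C}^2$. Let $S=e_1\otimes e_1\in\mathcal{M}_2(\mathbb{C})^+$ and $\mathbf{a}=(2,1)$ (with $k=d=2$ again).
Then $S$ is a non-invertible operator. We shall see that
$\mathcal{G}_0=\{\sqrt{2}\,e_1,e_2\}\in\mathbb{T}_2(\mathbf{a})$ is a global minimizer of $\Theta_{(N,S,\mathbf{a})}$,
for every u.i.n. $N$.
Indeed,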
[TABLE]
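and, if $\mathcal{G}\in\mathbb{T}_2(\mathbf{a})$
is arbitrary, then $\operatorname{tr}\lambda(S-S_{\mathcal{G}})=1-3=-2$, so that
$\operatorname{tr}s(S-S_{\mathcal{G}})\geq 2$. This last fact implies that
$s(S-S_{\mathcal{G}_0})\prec_w s(S-S_{\mathcal{G}})$ and therefore $\Theta_{(N,S,\mathbf{a})}(\mathcal{G}_0)\leq\Theta_{(N,S,\mathbf{a})}(\mathcal{G})$.
This problem is also co-feasible, with $p=1$, $s_1=\operatorname{rk}S_{\mathcal{G}_0}=2$
and $c_1=-1$.
Notice that in this case $\mathcal{G}_0$ is a frame
for $\mathbb{C}^2$ (even though $S\in\mathcal{M}_2(\mathbb{C})^+$ is not an invertible operator). △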
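Examples 4.9 and 4.10 can be reproduced numerically. The sketch below is illustrative only: it restricts to real families $g_i=\sqrt{a_i}\,(\cos t_i,\sin t_i)$ (an assumption made to keep the search two-dimensional), uses the Frobenius norm as one particular strictly convex u.i.n., and all function names are hypothetical.

```python
import math


def eig_sym2(m11, m12, m22):
    """Eigenvalues (in decreasing order) of the real symmetric matrix [[m11, m12], [m12, m22]]."""
    mean = (m11 + m22) / 2.0
    rad = math.hypot((m11 - m22) / 2.0, m12)
    return (mean + rad, mean - rad)


def residual_spectrum(s_diag, a, angles):
    """Spectrum of S - S_G for S = diag(s_diag) and g_i = sqrt(a_i) (cos t_i, sin t_i)."""
    m11, m12, m22 = float(s_diag[0]), 0.0, float(s_diag[1])
    for a_i, t in zip(a, angles):
        x = math.sqrt(a_i) * math.cos(t)
        y = math.sqrt(a_i) * math.sin(t)
        # Subtract the rank-one operator g_i ⊗ g_i.
        m11, m12, m22 = m11 - x * x, m12 - x * y, m22 - y * y
    return eig_sym2(m11, m12, m22)


def frobenius(spectrum):
    """Frobenius norm of a Hermitian 2x2 matrix, computed from its spectrum."""
    return math.hypot(spectrum[0], spectrum[1])


def best_on_grid(s_diag, a, steps=24):
    """Smallest Frobenius distance found on a grid of real competitors."""
    grid = [n * math.pi / steps for n in range(steps)]
    return min(frobenius(residual_spectrum(s_diag, a, (u, v)))
               for u in grid for v in grid)


# Example 4.9: S = diag(3, 1), a = (1, 1), G0 = {e1, e1}.
spec_49 = residual_spectrum((3, 1), (1, 1), (0.0, 0.0))
# Example 4.10: S = diag(1, 0), a = (2, 1), G0 = {sqrt(2) e1, e2}.
spec_410 = residual_spectrum((1, 0), (2, 1), (0.0, math.pi / 2))

print(spec_49, frobenius(spec_49), best_on_grid((3, 1), (1, 1)))
print(spec_410, frobenius(spec_410), best_on_grid((1, 0), (2, 1)))
```

Angles are sampled in $[0,\pi)$ only, since $g$ and $-g$ produce the same operator $g\otimes g$. In both examples no competitor on the grid improves on the distance attained by $\mathcal{G}_0$, in agreement with the trace argument used above.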
4.2 The general case
So far, we have considered the case of local minimizers of GFOD functions when the number of vectors k is greater than or equal to
the dimension of the space d. This was essentially needed in Section 3.3.
In this section we add the case when k<d, thus covering all possible cases. Our approach is based on a reduction to the case considered in Section 4.1.
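Definition 4.11.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and let $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ with $k<d$.
Let $\mathcal{B}=\{v_i\}_{i\in\mathbb{I}_d}$ be an ONB of $\mathbb{C}^d$ such that
$S=\sum_{i\in\mathbb{I}_d}\lambda_i(S)\,v_i\otimes v_i$. Let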
[TABLE]
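Since $k=\dim V_k$ (the "new $d$") we can take
$\delta(\lambda(S_k),\mathbf{a})\in\mathbb{R}^k$ using Definition 4.6,
for the data $\lambda(S_k)=(\lambda_1(S),\ldots,\lambda_k(S))\in(\mathbb{R}_{\geq 0}^k)^\downarrow$ and $\mathbf{a}\in(\mathbb{R}_{>0}^k)^\downarrow$. We define the vector
[TABLE]
which does not really depend on $S_k$ and $\mathcal{B}$, but only on $\lambda(S)$ and $\mathbf{a}$.
△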
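Theorem 4.12.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$, let $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$
and let $N$ be a strictly convex u.i.n. in $\mathcal{M}_d(\mathbb{C})$.
Given $\mathcal{G}_0=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ the following are equivalent:
1. $\mathcal{G}_0$ is a global minimizer of $\Theta_{(N,S,\mathbf{a})}$;
2. $\mathcal{G}_0$ is a local minimizer of $\Theta_{(N,S,\mathbf{a})}$;
3. $\lambda(S-S_{\mathcal{G}_0})=\delta(\lambda(S),\mathbf{a})^\downarrow$
(see Definition 4.6 if $k\geq d$, and
Definition 4.11 if $k<d$).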
Proof.
If $k\geq d$ this is Theorem 4.8. Let us assume that $k<d$.
Clearly 1. $\Rightarrow$ 2. If we assume 2, we can apply Theorem 3.2, Proposition 3.5
and Theorem 3.8
(these statements do not assume that $k\geq d$).
With the notation of these results (i.e., with Notation 3.4),
there exists $\mathcal{B}=\{v_i\}_{i\in\mathbb{I}_d}$ an ONB of $\mathbb{C}^d$ such that
[TABLE]
We have that
$r\stackrel{\text{def}}{=}\operatorname{rk}S_0\leq k$, and $W=R(S_{\mathcal{G}_0})=\operatorname{span}\{v_i\}_{i\in\mathbb{I}_r}\subseteq\operatorname{span}\{v_i\}_{i\in\mathbb{I}_k}=V_k$,
as in Definition 4.11.
Since $\lambda_i(S_0)=0$ for $i>k$, the vector
$\delta\stackrel{\text{def}}{=}\big(\lambda_{i}(S)-\lambda_{i}(S_{0})\big)_{i\in\mathbb{I}_{k}}\in\mathbb{R}^{k}$
satisfies that
[TABLE]
With the notation of Definition 4.11, we have to prove that
$\delta=\delta(\lambda(S_k),\mathbf{a})$. Since $r=s_p\leq k<d$, we can apply
Remark 3.6 (to $\lambda(S)-\lambda(S_0)\in\mathbb{R}^d$), so that
[TABLE]
or
[TABLE]
where the indexes $s_1<\ldots<s_{p-2}$ and constants $c_1<\ldots<c_{p-1}$
are constructed (for $\lambda(S-S_{\mathcal{G}_0})$ and therefore also for $\delta$)
in terms of the index $s_{p-1}$ (when $p>1$) using the algorithm given in
Theorem 3.8 (and also
in Definition 4.1, with respect to
$\lambda(S_k)$, $\mathbf{a}$ and $s_{p-1}$).
Also $c_{p-1}<c_p$ by Theorem 3.2.
Therefore, in order to show that $\delta=\delta(\lambda(S_k),\mathbf{a})$,
by Theorem 4.5 we just need to prove that
the index $s_{p-1}\in\mathbb{I}_{k-1}\cup\{0\}$ is co-feasible (and admissible) with respect to $S_k$ and $\mathbf{a}$.
By Theorem 3.2 and Proposition 3.5 we know that
$(S-S_0)\,g_i=c_p\,g_i\iff s_{p-1}+1\leq i\leq k$, and
[TABLE]
Hence, if we let X=span{viβ:spβ1β+1β€iβ€kΒ } and
Gpβ={giβ}i=spβ1β+1kββTXβ(a(spβ1β))
then
[TABLE]
By Remark 3.16 (for $S_k$ and $\mathbf{a}$),
we only need to show that $c_p=\max_{s_{p-1}+1\leq i\leq k}\delta_i$ $(=\max_{i\in\mathbb{I}_k}\delta_i)$.
Suppose that $c_p<\max\sigma(S-S_0)$. Then, by item 5 of Theorem 3.2,
the set $\mathcal{G}_0$ is linearly independent
(since each set $\{g_i\}_{i\in J_j}$ is linearly independent, and
they are sets of eigenvectors of the different eigenvalues $c_j$). Then
$s_p=\operatorname{rk}S_0=k$, so we can apply Eq. (27), and
automatically $c_p=\max_{i\in\mathbb{I}_k}\delta_i$.
Otherwise we have that $c_p=\max\sigma(S-S_0)\geq\max_{i\in\mathbb{I}_k}\delta_i$.
Then, in any case $c_p=\max_{i\in\mathbb{I}_k}\delta_i$. We have proved that
the index $s_{p-1}$ is co-feasible (and also admissible, because $c_{p-1}<c_p$)
with respect to $S_k$ and $\mathbf{a}$.
Then $\delta=\delta(\lambda(S_k),\mathbf{a})$ by Theorem 4.5 and
$\lambda(S-S_{\mathcal{G}_0})=\delta(\lambda(S),\mathbf{a})^\downarrow$ by Eq. (25).
3. $\Rightarrow$ 1. An argument analogous to that in the proof of Theorem 4.8 (3. $\Rightarrow$ 1.) proves this implication.
∎
Remark 4.13.
The proof of 2. $\Rightarrow$ 3. of Theorem 4.12 becomes trivial if we assume
that (the vectorial version of) the norm $N$ satisfies that, for $x,y\in\mathbb{R}^k$ and $z\in\mathbb{R}^{d-k}$,
[TABLE]
since in this case $\mathcal{G}_0$ is still a local minimizer for $S_k$ and $\mathbf{a}$ in $V_k$.
The most usual strictly convex norms (for example $p$-norms, for $p\in(1,\infty)$) satisfy
Eq. (28), but this property fails in general. Take $N=\|\cdot\|_\infty+\|\cdot\|_2$,
which is a strictly convex u.i.n. In this case, if $r=\frac{\sqrt{2}}{2}$ then ($d=3$, $k=2$)
[TABLE]
Corollary 4.14**.**
With the notation of Theorem 4.12, we have that
[TABLE]
Proof.
For $h\in\mathbb{I}_d$ and $\varepsilon>0$ let
[TABLE]
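Then, $N_{(h,\varepsilon)}$ is a strictly convex u.i.n. in $\mathcal{M}_d(\mathbb{C})$ such
that $\lim_{\varepsilon\to 0^+}N_{(h,\varepsilon)}(A)=N_{(h)}(A)$, for $A\in\mathcal{M}_d(\mathbb{C})$.
If we let $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ be such that $\lambda(S-S_{\mathcal{G}_0})=\delta(\lambda,\mathbf{a})^\downarrow$ then, by Theorem 4.12,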
[TABLE]
Since this occurs for every $h\in\mathbb{I}_d$, then $|\delta(\lambda,\mathbf{a})|\prec_w|\lambda(S-S_{\mathcal{G}})|$.
∎
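The last step of this proof uses that a weak majorization between absolute values amounts to one inequality per index $h$, each comparing sums of the $h$ largest entries. A minimal sketch follows; the names `ky_fan` and `weakly_submajorized` are hypothetical, and the competitor $\mathcal{G}=\{\sqrt{2}\,e_1,e_1\}$ for the data of Example 4.10 is chosen here only for illustration.

```python
def ky_fan(x, h):
    """Sum of the h largest absolute values of the entries of x."""
    return sum(sorted((abs(t) for t in x), reverse=True)[:h])


def weakly_submajorized(x, y, tol=1e-12):
    """Return True if |x| ≺_w |y|, checked through the inequalities
    ky_fan(x, h) <= ky_fan(y, h) for h = 1, ..., len(x)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return all(ky_fan(x, h) <= ky_fan(y, h) + tol
               for h in range(1, len(x) + 1))


# Data of Example 4.10: S = e1 ⊗ e1 and a = (2, 1).
delta = [-1.0, -1.0]      # spectrum of S - S_G0 for G0 = {sqrt(2) e1, e2}
competitor = [0.0, -2.0]  # spectrum of S - S_G for G = {sqrt(2) e1, e1}

print(weakly_submajorized(delta, competitor))  # True
print(weakly_submajorized(competitor, delta))  # False
```

For the competitor, $S_{\mathcal{G}}=3\,e_1\otimes e_1$, so $S-S_{\mathcal{G}}=-2\,e_1\otimes e_1$, which gives the second spectrum.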
5 Proof of some technical results
In this section we prove some results stated in Section 3.2.
We begin by re-stating Notation 3.4, that we will use again
throughout this section.
Notation 3.4 (repeated).
Fix $S\in\mathcal{M}_d(\mathbb{C})^+$, $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$, and a strictly convex u.i.n. $N$ on $\mathcal{M}_d(\mathbb{C})$. Also consider
the notions introduced in Theorem 3.2. As before, let
1. $\Theta_{(N,S,\mathbf{a})}=\Theta:\mathbb{T}_d(\mathbf{a})\to\mathbb{R}_{\geq 0}$ given by $\Theta(\mathcal{G})=N(S-S_{\mathcal{G}})$.
2. A local minimizer $\mathcal{G}_0=\{g_i\}_{i\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ of $\Theta_{(N,S,\mathbf{a})}$, with frame operator
$S_0=S_{\mathcal{G}_0}$.
3. We denote by $\lambda=(\lambda_i)_{i\in\mathbb{I}_d}=\lambda(S)\in(\mathbb{R}_{\geq 0}^d)^\downarrow$ and
$\mu=(\mu_i)_{i\in\mathbb{I}_d}=\lambda(S_0)\in(\mathbb{R}_{\geq 0}^d)^\downarrow$.
4. We fix $\mathcal{B}=\{v_i\}_{i\in\mathbb{I}_d}$ an ONB of $\mathbb{C}^d$ as in Theorem 3.2. Hence,
[TABLE]
5. We consider
$W=R(S_0)$, $D=(S-S_0)|_W$ and
$\sigma(D)=\{c_1,\ldots,c_p\}$ where $c_1<c_2<\ldots<c_p$.
6. Let $s_D=\max\{i\in\mathbb{I}_d:\mu_i\neq 0\}=\operatorname{rk}S_0$.
7. We denote by $\delta=\lambda-\mu\in\mathbb{R}^d$
so that
[TABLE]
Notice that $\delta$ is constructed by pairing the entries
of ordered vectors (since $\lambda=\lambda(S)$ and $\mu=\lambda(S_0)$). Nevertheless, we have that $\lambda(S-S_0)=\delta^\downarrow$.
In what follows we obtain some properties of (the unordered vector) $\delta$.
8. For each $j\in\mathbb{I}_p$, we consider the following sets of indexes:
[TABLE]
Theorem 3.2 assures that
$\mathbb{I}_{s_D}=\bigsqcup_{j\in\mathbb{I}_p}K_j$ and $\mathbb{I}_k=\bigsqcup_{j\in\mathbb{I}_p}J_j$ (disjoint unions).
9. By Eq. (2), $R(S_0)=\operatorname{span}\{g_i:i\in\mathbb{I}_k\}=W=\bigoplus_{i\in\mathbb{I}_p}\ker(D-c_i\,I_W)$;
then, for every $j\in\mathbb{I}_{p}$,
[TABLE]
because $g_i\in\ker(D-c_j\,I_W)$ for every $i\in J_j$.
Note that, by Theorem 3.2, each $W_j$ reduces both $S$ and $S_0$. △
In order to prove Proposition 3.5 we first present the following two results.
Proposition 5.1.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and let $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ be as in Notation 3.4 and assume that $p>1$.
Assume that there exist
[TABLE]
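Then, there exists a
continuous curve $\mathcal{G}(t):[0,1)\to\mathbb{T}_d(\mathbf{a})$ such that $\mathcal{G}(0)=\mathcal{G}_0$ and
$\lambda(S-S_{\mathcal{G}(t)})\prec\lambda(S-S_0)$ with strict majorization for $t\in(0,\varepsilon)$ for some $\varepsilon>0$.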
Proof.
Consider
[TABLE]
(note that $\langle w_h,w_l\rangle=0$ because
$\langle g_h,g_l\rangle=0$). Now define, for $t\in\mathbb{R}$ and for some convenient $\gamma\in\mathbb{R}\setminus\{0\}$ (which will be explicitly calculated later),
[TABLE]
Then consider the family $\mathcal{G}_\gamma(t)$, which is obtained from $\mathcal{G}_0$ by
replacing the vectors $g_h$ and $g_l$ by $g_h(t)$ and $g_l(t)$ respectively, and denote by $S_\gamma(t)$ its frame operator.
Note that $\mathcal{G}_\gamma(t)\in\mathbb{T}_d(\mathbf{a})$ for every $t\in\mathbb{R}$ and $\mathcal{G}_\gamma(0)=\mathcal{G}_0$.
Let $W_{h,l}=\operatorname{span}\{w_h,w_l\}$; this subspace reduces
both $S-S_0$ and $S-S_\gamma(t)$. The fact that $g_h(t),g_l(t)\in W_{h,l}$ allows us to
represent the following matrix with respect to the basis $\{w_h,w_l\}$ of $W_{h,l}$,
[TABLE]
[TABLE]
Then,
[TABLE]
Hence $(S-S_0)|_{W_{h,l}^\perp}=(S-S_\gamma(t))|_{W_{h,l}^\perp}$. On the other hand
$(S-S_0)|_{W_{h,l}}=\begin{pmatrix}c_i&0\\0&c_r\end{pmatrix}$
and
[TABLE]
Since $\operatorname{tr}(A_\gamma(t))=c_i+c_r$ for every $t\in\mathbb{R}$, we have the strict majorization
$\lambda(A_\gamma(t))\prec(c_r,c_i)$ if and only if $\|A_\gamma(t)\|_2^2<c_r^2+c_i^2$.
So consider the function $m_\gamma:\mathbb{R}\to\mathbb{R}$ given by
[TABLE]
Notice that $A_\gamma(0)=(S-S_0)|_{W_{h,l}}$, so $m_\gamma(0)=\operatorname{tr}\big((S-S_0)|_{W_{h,l}}^2\big)=c_r^2+c_i^2$.
The next step is to find a convenient $\gamma\in\mathbb{R}\setminus\{0\}$ such that $m_\gamma'(0)=0$ but $m_\gamma''(0)<0$;
in this case we obtain
the strict
majorization $\lambda(A_\gamma(t))\prec(c_r,c_i)$ for $t\in(0,\varepsilon)$, for some $\varepsilon>0$.
This last fact implies that
$\lambda(S-S_\gamma(t))\prec\lambda(S-S_0)$ strictly, for $t\in(0,\varepsilon)$, as desired.
We start by computing the derivatives of the entries $a_{ij}(t)$ of $A_\gamma(t)$, for $1\leq i,j\leq 2$:
[TABLE]
Then
[TABLE]
Note that $m_\gamma''(0)$ is a quadratic function of $\gamma$ whose discriminant is
[TABLE]
because we assume that $a_h\leq a_l$ (and we have that $c_r>c_i$),
[TABLE]
Then, there exists $\gamma\in\mathbb{R}\setminus\{0\}$ such that $m_\gamma''(0)<0$.
∎
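The equivalence between strict majorization and the inequality for $\|A_\gamma(t)\|_2^2$ used in the proof of Proposition 5.1 can be verified directly. The following short derivation is a sketch under the assumptions that $\|\cdot\|_2$ denotes the Frobenius norm and that $\lambda_1\geq\lambda_2$ denote the eigenvalues of $A_\gamma(t)$ (names introduced here only for this computation):

```latex
\begin{align*}
T &:= \lambda_1+\lambda_2 = c_r+c_i , \\
\|A_\gamma(t)\|_2^2 &= \lambda_1^2+\lambda_2^2
   = \tfrac{1}{2}\big(T^2+(\lambda_1-\lambda_2)^2\big), \\
c_r^2+c_i^2 &= \tfrac{1}{2}\big(T^2+(c_r-c_i)^2\big), \\
\|A_\gamma(t)\|_2^2 < c_r^2+c_i^2
   &\iff \lambda_1-\lambda_2 < c_r-c_i
   \iff \lambda_1 < c_r .
\end{align*}
```

Since both vectors have the same sum $T$ and $c_r>c_i$, the condition $\lambda_1<c_r$ is exactly the strict majorization $(\lambda_1,\lambda_2)\prec(c_r,c_i)$.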
The following result together with Proposition 5.1 will allow us to obtain a proof of Proposition 3.5 (see below).
Proposition 5.2.
Let $S\in\mathcal{M}_d(\mathbb{C})^+$ and let $\mathcal{G}_0\in\mathbb{T}_d(\mathbf{a})$ be as in Notation 3.4 and assume that $p>1$.
Assume that there exist
[TABLE]
Then, there exists a
continuous curve $\mathcal{G}(t):[0,1)\to\mathbb{T}_d(\mathbf{a})$ such that $\mathcal{G}(0)=\mathcal{G}_0$ and such that
$\lambda(S-S_{\mathcal{G}(t)})\prec\lambda(S-S_0)$ with strict majorization for $t\in(0,\varepsilon)$ for some $\varepsilon>0$.
Proof.
With the notation of the statement and Notation 3.4, notice that
[TABLE]
As in
Notation 3.4, consider B={vlβ}lβIdββ an ONB of Cd such that
[TABLE]
For tβ[0,1) we let
[TABLE]
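Notice that, if $l\in J_e$, then
$(S-S_0)\,g_l=c_e\,g_l\implies\langle g_l,v_j\rangle=0$.
Similarly, if $l\in\mathbb{I}_k\setminus J_e$ then $\langle g_l,v_i\rangle=0$ (so that $g_l(t)=g_l$).
Therefore the family $\mathcal{G}(t)=\{g_l(t)\}_{l\in\mathbb{I}_k}\in\mathbb{T}_d(\mathbf{a})$ for $t\in[0,1)$.
Let $P_i=v_i\otimes v_i$ and $P_{ji}=v_j\otimes v_i$ (so that $P_{ji}\,x=\langle x,v_i\rangle\,v_j$). Then,
for every $t\in[0,1)$,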
[TABLE]
That is, if $V(t)=I+\big((1-t^2)^{1/2}-1\big)\,P_i+t\,P_{ji}\in\mathcal{M}_d(\mathbb{C})$ then $g_l(t)=V(t)\,g_l$ for every $l\in\mathbb{I}_k$ and $t\in[0,1)$.
Therefore, we get that
[TABLE]
Hence, we obtain the representation
[TABLE]
where the functions $\gamma_{rs}(t)$ are the entries of
$A(t)=\big(\gamma_{rs}(t)\big)_{r,s=1}^{2}\in\mathcal{H}(2)$
defined by
[TABLE]
It is straightforward to check that $\operatorname{tr}(A(t))=\mu_j+\mu_i$ and that $\det(A(t))=(1-t^2)\,\mu_j\,\mu_i$.
These facts imply that
if we consider the continuous function $L(t)=\lambda_{\max}(A(t))$ then $L(0)=\mu_j$ and $L(t)$ is strictly
increasing in $[0,1)$.
More straightforward computations show that we can consider continuous curves $x_i(t):[0,1)\to\mathbb{C}^2$
such that $\{x_1(t),x_2(t)\}$ is an ONB of $\mathbb{C}^2$ and
[TABLE]
For $t\in[0,1)$ we let $X(t)=(u_{r,s}(t))_{r,s=1}^2\in\mathcal{U}(2)$ with columns $x_1(t)$ and $x_2(t)$.
By construction, $X(t):[0,1)\to\mathcal{U}(2)$ is a continuous curve such that $X(0)=I_2$ and such that
[TABLE]
Finally, consider the continuous curve U(t):[0,1)βU(d) given by
[TABLE]
Notice that $U(0)=I$; also, let $\tilde{\mathcal{G}}(t)=U(t)^*\,\mathcal{G}(t)\in\mathbb{T}_d(\mathbf{a})$ for $t\in[0,1)$, which is a
continuous curve such that $\tilde{\mathcal{G}}(0)=\mathcal{G}_0$.
In this case, for $t\in[0,1)$ we have that
[TABLE]
In other words, $U(t)$ is constructed in such a way that $\mathcal{B}=\{v_l\}_{l\in\mathbb{I}_d}$ consists of eigenvectors of $S_{\tilde{\mathcal{G}}(t)}$ for
every $t\in[0,1)$. Hence, if $E(t)=L(t)-\mu_j\geq 0$ for $t\in[0,1)$, we get that
[TABLE]
Let $\varepsilon>0$ be such that
$E(t)=L(t)-\mu_j\leq\frac{c_r-c_e}{2}$ for $t\in[0,\varepsilon]$ (recall that
$L(0)=\mu_j$ and that $c_e<c_r$). Since $L(t)$ (and hence $E(t)$) is strictly
increasing in $[0,1)$, we see that
[TABLE]
where the majorization relations above are strict.
∎
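The monotonicity of $L(t)=\lambda_{\max}(A(t))$ used in the proof of Proposition 5.2 follows from the trace and determinant identities alone, because the largest eigenvalue of a $2\times 2$ Hermitian matrix is $\big(\operatorname{tr}+\sqrt{\operatorname{tr}^2-4\det}\,\big)/2$. The sketch below evaluates this formula; the numerical values of $\mu_j$ and $\mu_i$ are hypothetical, and the condition $\mu_j\geq\mu_i>0$ is an assumption made here so that $L(0)=\mu_j$ and the growth is strict.

```python
import math


def lambda_max_2x2(trace, det):
    """Largest eigenvalue of a 2x2 Hermitian matrix from its trace and determinant."""
    disc = max(trace * trace - 4.0 * det, 0.0)  # guard against rounding below zero
    return (trace + math.sqrt(disc)) / 2.0


def largest_eigenvalue_curve(t, mu_j, mu_i):
    """L(t), using tr A(t) = mu_j + mu_i and det A(t) = (1 - t^2) mu_j mu_i."""
    return lambda_max_2x2(mu_j + mu_i, (1.0 - t * t) * mu_j * mu_i)


mu_j, mu_i = 3.0, 1.0  # hypothetical eigenvalues with mu_j >= mu_i > 0
samples = [largest_eigenvalue_curve(n / 10.0, mu_j, mu_i) for n in range(10)]
print(samples[0])  # L(0) equals mu_j
print(all(u < v for u, v in zip(samples, samples[1:])))  # strictly increasing
```

The discriminant equals $(\mu_j-\mu_i)^2+4t^2\mu_j\mu_i$, which makes both $L(0)=\mu_j$ and the strict growth in $t$ visible at a glance.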
Proof of Proposition 3.5.
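Fix $S\in\mathcal{M}_d(\mathbb{C})^+$, $\mathbf{a}=(a_i)_{i\in\mathbb{I}_k}\in(\mathbb{R}_{>0}^k)^\downarrow$ and a strictly convex u.i.n. $N$ on $\mathcal{M}_d(\mathbb{C})$.
Consider $\mathcal{G}_0$ a local minimizer of $\Theta_{(N,S,\mathbf{a})}$ in $\mathbb{T}_d(\mathbf{a})$. Then, $\mathcal{G}_0$ satisfies
the assumptions in Notation 3.4; with this notation, assume that $p>1$.
We show that there exist $0=s_0<s_1<\ldots<s_{p-1}<s_p=\operatorname{rk}S_0\leq d$ such that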
[TABLE]
Indeed, if the sets $J_j$ for $j\in\mathbb{I}_p$ do not have the structure described above (i.e.
increasing sets formed by consecutive indexes), then there exist indexes $i,r\in\mathbb{I}_p$ and $h,l\in\mathbb{I}_k$ for which
Eq. (30) holds. In this case, Proposition 5.1 shows that there exists a continuous curve $\mathcal{G}(t):[0,1)\to\mathbb{T}_d(\mathbf{a})$ such that $\mathcal{G}(0)=\mathcal{G}_0$ and such that
$\lambda(S-S_{\mathcal{G}(t)})\prec\lambda(S-S_0)$ with strict majorization for $t\in(0,\varepsilon)$ for some $\varepsilon>0$.
Since $N$ is a strictly convex u.i.n. we conclude that
[TABLE]
This last fact contradicts the local minimality of $\mathcal{G}_0$. Hence, there exist indexes
$s_0=0<s_1<\ldots<s_{p-1}<s_p\leq d$ for which the representation of the sets $J_j$ for $j\in\mathbb{I}_p$ as in
Eq. (34) holds.
Similarly, if the sets $K_j$ for $j\in\mathbb{I}_p$ are not increasing sets formed by consecutive indexes then, using Proposition
5.2, we also get that $\mathcal{G}_0$ is not a local minimizer; this last fact contradicts the hypothesis on $\mathcal{G}_0$.
Finally, notice that by Theorem 3.2 we have that
the family $\{g_i\}_{i\in J_j}$ is linearly independent for every $j\in\mathbb{I}_{p-1}$.
In particular, by Eq. (29), we get that $\dim(W_j)=|K_j|=|J_j|$ for $j\in\mathbb{I}_{p-1}$. Hence,
we get that $J_j=K_j$ for $j\in\mathbb{I}_{p-1}$ and that $K_p=\{s_{p-1}+1,\ldots,s_p\}$, and the result follows.
∎
In what follows, we show Theorem 3.8. First, we consider a preliminary result.
Proposition 5.3.
Consider Notation 3.7 and 3.4, and assume that $p>1$. Assume further that the sets $J_j$ and $K_j$, for $j\in\mathbb{I}_p$, satisfy Eq.
(34) above. Then,
1. We have that $(a_i)_{i\in J_j}\prec(\lambda_i-c_j)_{i\in K_j}$, for $j\in\mathbb{I}_p$.
2. If $0\leq r<s\leq d$ then $(a_j)_{j=r+1}^{s}\prec(\lambda_j-P_{r+1,s})_{j=r+1}^{s}$ if and only if
[TABLE]
Proof.
For each $j\in\mathbb{I}_p$, consider $W_j=\operatorname{span}\{g_i:\ i\in J_j\}=R(S_{\mathcal{G}_j})$, so that $\dim W_j=|K_j|$, and let $Q_j$ be the orthogonal projection onto $W_j$.
Then $W_j$ reduces both $S$ and $S_0$; moreover, $(S-S_0)\,Q_j=c_j\,Q_j$ and $S_0\,Q_j=S_{\mathcal{G}_j}$. Then,
[TABLE]
Hence, by the Schur-Horn theorem we get that $(a_i)_{i\in J_j}\prec\lambda(S_{\mathcal{G}_j})$, which is equivalent to the
majorization relation $(a_i)_{i\in J_j}\prec(\lambda_i-c_j)_{i\in K_j}$, and item 1 follows.
Let $0\leq r<s\leq d$ and notice that by construction $(a_j)_{j=r+1}^{s},\,(\lambda_j-P_{r+1,s})_{j=r+1}^{s}\in(\mathbb{R}^{s-r})^\downarrow$.
On the other hand, if $r+1\leq i\leq s$ then
[TABLE]
This last fact shows item 2.
β
Proof of Theorem 3.8.
If $\mathcal{G}_0$ is a local minimizer of $\Theta_{(N,S,\mathbf{a})}$ on $\mathbb{T}_d(\mathbf{a})$ for a strictly convex u.i.n., then
the previous results imply that the sets $J_j$ and $K_j$ associated with $\mathcal{G}_0$ satisfy Eq.
(34). Hence, we show that the following relations hold:
1. The index $s_{1}=\max\big\{j\leq s_{p-1}:\ P_{1,j}=\min\limits_{i\leq s_{p-1}}P_{1,i}\big\}$, and
$c_1=P_{1,s_1}$.
2. Recursively, if $s_j<s_{p-1}$, then
[TABLE]
Indeed, consider an arbitrary $0\leq j\leq p-2$. By item 1 in Proposition 5.3 and the fact that $J_{j+1}=K_{j+1}=\{s_j+1,\ldots,s_{j+1}\}$ we see that
[TABLE]
Now, using the majorization relation in Eq. (36) and item 2 in Proposition 5.3 we also get that
[TABLE]
Therefore, if the relations between the indexes $s_0=0<\ldots<s_{p-1}$ and the constants $c_1<\ldots<c_{p-1}$ in the statement do not hold, we get that there
exists $0\leq j\leq p-2$ such that
[TABLE]
By definition of t we get that
[TABLE]
Also, there exists
$j+1\leq\ell\leq p-2$ such that $s_\ell<t\leq s_{\ell+1}$. Using the majorization relation in Eq. (36)
we see that for $j\leq r\leq\ell-1$:
[TABLE]
Then, the previous inequalities allow us to bound
[TABLE]
which represents the lower bound $\beta$ as a convex combination of the constants $c_{j+1}<\ldots<c_{\ell+1}$.
This last fact clearly implies that $P_{s_j+1,t}\geq\beta>c_{j+1}$, which contradicts Eq. (37).
∎
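The recursion for the indexes $s_j$ and constants $c_j$ described at the beginning of this proof can be sketched in a few lines. This is only an illustration under two labeled assumptions: first, that $P_{r+1,s}$ denotes the average of $\lambda_j-a_j$ over $r+1\leq j\leq s$ (the value forced by the equality of sums in the majorization of Proposition 5.3, item 2; its definition is in Notation 3.7, which is not reproduced here); second, that the recursive step has the same form as the first step, restarted at $s_j+1$. The function names and the sample data are hypothetical.

```python
from fractions import Fraction


def block_average(lam, a, r, s):
    """Assumed P_{r+1,s}: average of lam_j - a_j over the 1-based indexes r+1..s."""
    total = sum(Fraction(lam[j]) - Fraction(a[j]) for j in range(r, s))
    return total / (s - r)


def indexes_and_constants(lam, a, last):
    """Return ([s_1, ..., s_{p-1}], [c_1, ..., c_{p-1}]) with s_{p-1} = last.

    Each step picks the largest index attaining the minimal block average
    (assumption: every step mirrors the rule stated for s_1 and c_1).
    """
    s_list, c_list = [], []
    start = 0  # s_0 = 0
    while start < last:
        averages = {t: block_average(lam, a, start, t)
                    for t in range(start + 1, last + 1)}
        c = min(averages.values())
        s = max(t for t, value in averages.items() if value == c)
        s_list.append(s)
        c_list.append(c)
        start = s
    return s_list, c_list


# Hypothetical data with s_{p-1} = 2.
print(indexes_and_constants([6, 5, 1], [3, 1, 1], 2))
print(indexes_and_constants([5, 3, 2], [3, 1, 1], 2))
```

Exact rational arithmetic is used so that ties between block averages, which decide the choice of the largest minimizing index, are detected reliably. In the first sample the constants come out strictly increasing, as the statement requires.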