On the optimal error bound for the first step in the method
of cyclic alternating projections
Ivan Feshchenko
Taras Shevchenko National University of Kyiv,
Faculty of Mechanics and Mathematics, Kyiv, Ukraine and
Samsung R&D Institute Ukraine, 57 L’va Tolstogo str., Kiev 01032, Ukraine
Let H be a Hilbert space and H1,...,Hn be closed subspaces of H.
Set H0:=H1∩H2∩...∩Hn and let
Pk be the orthogonal projection onto Hk, k=0,1,...,n.
The paper is devoted to the study of functions fn:[0,1]→R
defined by
fn(c) = sup{∥Pn...P2P1−P0∥ ∣ cF(H1,...,Hn)⩽c}, c∈[0,1],
where the supremum is taken over all systems of subspaces H1,...,Hn
for which the Friedrichs number cF(H1,...,Hn) is less than or equal to c.
Using the functions fn one can easily get an upper bound for the rate of convergence
in the method of cyclic alternating projections.
We will show that the problem of finding fn(c) is equivalent to
a certain optimization problem on a subset of the set of
Hermitian complex n×n matrices.
Using the equivalence we find f3 and study properties of fn, n⩾4.
Moreover, we show that
1−an(1−c)−b̃n(1−c)² ⩽ fn(c) ⩽ 1−an(1−c)+bn(1−c)²
for all c∈[0,1],
where an=2(n−1)sin²(π/(2n)), bn=6(n−1)²sin⁴(π/(2n))
and b̃n is some positive number.
Key words and phrases:
Hilbert space, system of subspaces, orthogonal projection,
Friedrichs number.
2010 Mathematics Subject Classification:
46C07, 47B15.
1. Introduction
1.1. The Friedrichs number of a pair of subspaces and
the method of alternating projections for two subspaces
Let H be a complex Hilbert space and H1,H2 be two closed subspaces of H.
The number cF(H1,H2) defined by
cF(H1,H2) := sup{∣⟨x,y⟩∣ ∣ x∈H1⊖(H1∩H2), ∥x∥⩽1, y∈H2⊖(H1∩H2), ∥y∥⩽1}
is called the Friedrichs number (more precisely, the cosine of the Friedrichs angle)
of subspaces H1,H2.
Why is cF important?
A few properties of a pair H1,H2 can be formulated in terms of the Friedrichs number,
for example
(1) the orthogonal projections onto H1 and H2 commute if and only if cF(H1,H2)=0;
(2) the sum H1+H2 is closed if and only if cF(H1,H2)<1,
see, e.g., [6].
Also, the Friedrichs number is closely related to the rate of convergence in the
method of alternating projections.
This is a well-known method of finding the orthogonal projection of a given element x∈H
onto the intersection H1∩H2 when the orthogonal projections P1 and P2
onto H1 and H2 are assumed to be known.
Define the sequence x0:=x, x1:=P1x0, x2:=P2x1,
x3:=P1x2, x4:=P2x3 and so on.
Back in 1933 von Neumann [10]
proved that xk→P0x as k→∞,
where P0 is the orthogonal projection onto H1∩H2.
What can be said about the rate of convergence?
Since x2k=(P2P1)^k x, we see that
∥x2k−P0x∥ = ∥((P2P1)^k−P0)x∥ ⩽ ∥(P2P1)^k−P0∥∥x∥.
With respect to the orthogonal decomposition
H=(H1∩H2)⊕(H⊖(H1∩H2)) we have
P1=I⊕P1′, P2=I⊕P2′ and P0=I⊕0,
where I is the identity operator and P1′,P2′ are orthogonal projections.
Hence
(P2P1)^k−P0 = 0⊕(P2′P1′)^k = (P2P1−P0)^k
and
∥(P2P1)^k−P0∥ = ∥(P2P1−P0)^k∥ ⩽ ∥P2P1−P0∥^k.
But ∥P2P1−P0∥=cF(H1,H2) (see, e.g., [6]) and therefore we get the estimate
∥x2k−P0x∥ ⩽ (cF(H1,H2))^k ∥x∥.
This estimate is not sharp.
Aronszajn [1] proved that
∥(P2P1)^k−P0∥ ⩽ (cF(H1,H2))^(2k−1).
Therefore we get
∥x2k−P0x∥ ⩽ (cF(H1,H2))^(2k−1) ∥x∥.
It is worth mentioning that this estimate is sharp because Kayalar and Weinert [8] proved that
∥(P2P1)^k−P0∥ = (cF(H1,H2))^(2k−1).
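As a numerical illustration (not part of the original argument), the following sketch takes two lines in the real plane R² meeting at an angle φ∈(0,π/2], so that H1∩H2={0}, P0=0 and cF(H1,H2)=cos φ, and evaluates ∥(P2P1)^k−P0∥ directly. The helper names are ours, and the real plane is used instead of a complex space only for simplicity.

```python
import math


def line_projection(angle):
    """Matrix of the orthogonal projection onto the line spanned by (cos a, sin a)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * c, c * s], [c * s, s * s]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def operator_norm(m):
    """Spectral norm of a real 2x2 matrix: sqrt of the largest eigenvalue of m^T m."""
    mt = [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]
    g = matmul(mt, m)
    tr = g[0][0] + g[1][1]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    disc = max(tr * tr - 4.0 * det, 0.0)  # guard against tiny negative round-off
    return math.sqrt((tr + math.sqrt(disc)) / 2.0)


def error_after_k_cycles(phi, k):
    """||(P2 P1)^k - P0|| for two lines at angle phi; here P0 = 0."""
    p1, p2 = line_projection(0.0), line_projection(phi)
    step = matmul(p2, p1)
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(k):
        power = matmul(step, power)
    return operator_norm(power)
```

For instance, `error_after_k_cycles(0.3, 4)` can be compared with `math.cos(0.3) ** 7`, the value predicted by the exponent 2k−1.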
1.2. The method of cyclic alternating projections for n subspaces
Let H be a complex Hilbert space and H1,...,Hn be closed subspaces of H.
The method of cyclic alternating projections is a well-known method
of finding the orthogonal projection of a given element x∈H
onto the intersection H1∩H2∩...∩Hn
when the orthogonal projections Pi onto Hi, i=1,2,...,n are assumed to be known.
The method plays an important role in many areas of mathematics, see, e.g., [5].
Define the sequence
x0:=x, x1:=P1x0, x2:=P2x1, ..., xn:=Pnx_{n−1},
and after this
x_{n+1}:=P1xn, x_{n+2}:=P2x_{n+1}, ..., x_{2n}:=Pnx_{2n−1},
and so on.
Back in 1962 Halperin [7] proved that xk→P0x as k→∞,
where P0 is the orthogonal projection onto the intersection H1∩H2∩...∩Hn.
A simple and elegant proof of the result can be found in [9].
In particular, the subsequence x_{nk}=(Pn...P2P1)^k x→P0x as k→∞.
What can be said about the rate of convergence of {x_{nk}∣k⩾1} to P0x?
To answer this question Badea, Grivaux and Müller
in [2], [3] introduced the Friedrichs number of n subspaces,
cF(H1,...,Hn).
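The method itself is easy to sketch in code. The example below is only an illustration under simplifying assumptions: the space is R^d, every subspace Hi is a hyperplane through the origin given by a normal vector, and the function names are ours.

```python
def project_onto_hyperplane(x, normal):
    """Orthogonal projection of x onto the subspace {y : <y, normal> = 0}."""
    nn = sum(t * t for t in normal)
    coef = sum(a * b for a, b in zip(x, normal)) / nn
    return [a - coef * b for a, b in zip(x, normal)]


def cyclic_projections(x, normals, sweeps):
    """Apply P_n ... P_2 P_1 to x the given number of times (one sweep = one cycle)."""
    x = list(x)
    for _ in range(sweeps):
        for normal in normals:
            x = project_onto_hyperplane(x, normal)
    return x


# Three planes in R^3 that all contain the z-axis; their intersection is the z-axis,
# so the limit of the iterates is the projection of x onto the z-axis.
normals = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 2.0, 0.0)]
limit = cyclic_projections([3.0, -2.0, 5.0], normals, sweeps=40)
```

In this example the iterates approach (0, 0, 5), the orthogonal projection of the starting point onto the common line.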
1.3. The Friedrichs number of n subspaces
Badea, Grivaux and Müller noticed that for two subspaces H1,H2
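cF(H1,H2) = sup{ 2Re⟨x1,x2⟩/(∥x1∥²+∥x2∥²) ∣ xi∈Hi⊖(H1∩H2), i=1,2, (x1,x2)≠(0,0) }
and defined
cF(H1,...,Hn) := sup{ (1/(n−1))∑_{i≠j}⟨xi,xj⟩/(∥x1∥²+...+∥xn∥²) ∣ xi∈Hi⊖(H1∩...∩Hn), i=1,...,n, (x1,...,xn)≠(0,...,0) }.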
Since this definition is rather involved, we will present a simpler formula for cF.
But first we define the Dixmier number of n subspaces, cD(H1,...,Hn).
Following [3], set
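cD(H1,...,Hn) := sup{ (1/(n−1))∑_{i≠j}⟨xi,xj⟩/(∥x1∥²+...+∥xn∥²) ∣ xi∈Hi, i=1,...,n, (x1,...,xn)≠(0,...,0) }.
It is clear that
cF(H1,...,Hn) = cD(H1⊖H0,...,Hn⊖H0),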
where H0=H1∩H2∩...∩Hn.
The Dixmier number of n subspaces is closely related to the sum of the corresponding orthogonal projections.
Proposition 1.1.
The following equality holds:
∥P1+...+Pn∥ = 1+(n−1)cD(H1,...,Hn).
As a corollary, we see that
cD(H1,...,Hn) = (∥P1+...+Pn∥−1)/(n−1)
and consequently
cF(H1,...,Hn) = (∥P1+...+Pn−nP0∥−1)/(n−1).
This equality is not new, see [3, Proposition 3.7].
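The relation between the Dixmier number and the norm of the sum of projections can be tried out numerically. The sketch below assumes Proposition 1.1 in the form ∥P1+...+Pn∥=1+(n−1)cD(H1,...,Hn) (the form used later in the proof of Theorem 2.7) and restricts itself to lines through the origin in the real plane; the function name is ours.

```python
import math


def dixmier_number_of_lines(angles):
    """c_D for the lines spanned by (cos a, sin a), a in angles (needs n >= 2).

    The sum of the projections is the symmetric 2x2 matrix [[a, b], [b, d]];
    its norm is its largest eigenvalue, and c_D = (norm - 1) / (n - 1).
    """
    n = len(angles)
    if n < 2:
        raise ValueError("at least two lines are required")
    a = sum(math.cos(t) ** 2 for t in angles)
    b = sum(math.cos(t) * math.sin(t) for t in angles)
    d = sum(math.sin(t) ** 2 for t in angles)
    largest = (a + d) / 2.0 + math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return (largest - 1.0) / (n - 1)
```

For two distinct lines at angle φ this gives cos φ, in agreement with the Friedrichs number of the pair; for n copies of the same line it gives 1.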
1.4. The rate of convergence in the method of cyclic alternating projections
Let us return to the question of the rate of convergence in the method of cyclic alternating projections.
In [3] Badea, Grivaux and Müller showed that
(1) if cF(H1,...,Hn)<1, i.e., if the angle between H1,...,Hn is positive, then
∥(Pn...P2P1)^k−P0∥ ⩽ q^k, k⩾1,
for some q=q(cF(H1,...,Hn))∈[0,1).
The inequality means that the sequence of operators (Pn...P2P1)^k converges “quickly” to P0 as k→∞;
(2) if cF(H1,...,Hn)=1, i.e., if the angle between H1,...,Hn equals zero, then
∥(Pn...P2P1)^k−P0∥ = 1, k⩾1.
Moreover, the sequence of operators (Pn...P2P1)^k converges strongly to P0 as k→∞
and we have “arbitrarily slow” convergence of (Pn...P2P1)^k to P0
(see [3]).
For a more complete picture of the quick uniform convergence/arbitrarily slow convergence dichotomy
see [3] and [4].
1.5. What this paper is about.
Let H be a complex Hilbert space and H1,...,Hn be closed subspaces of H.
Denote by Pi the orthogonal projection onto Hi, i=1,...,n.
Set H0:=H1∩H2∩...∩Hn.
Denote by P0 the orthogonal projection onto H0.
This paper is devoted to the study of functions fn:[0,1]→R, n⩾2,
defined by
fn(c) := sup{∥Pn...P2P1−P0∥ ∣ cF(H1,...,Hn)⩽c}, c∈[0,1].
The supremum is taken over all systems of subspaces H1,...,Hn with
cF(H1,...,Hn)⩽c, where c∈[0,1] is a given number.
Remark 1.1.
The reader may wonder why we do not write cF(H1,...,Hn)=c.
Answer: we believe that the assumption cF(H1,...,Hn)⩽c is more convenient for applications.
Indeed, finding the exact value of cF(H1,...,Hn) is usually much more difficult than obtaining the
inequality cF(H1,...,Hn)⩽c.
1.6. An equivalent problem
Let us present a problem which is equivalent to the problem of finding fn(c).
The fact that these problems are equivalent will be used in the sequel.
Proposition 1.2.
For every c∈[0,1]
fn(c) = sup{∥Pn...P2P1∥ ∣ cD(H1,...,Hn)⩽c},
where the supremum is taken over all systems of subspaces H1,...,Hn with
cD(H1,...,Hn)⩽c.
Using the functions fn one can easily estimate the rate of convergence in the method of cyclic alternating projections.
Indeed, we have
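∥x_{nk}−P0x∥ = ∥((Pn...P2P1)^k−P0)x∥ ⩽ ∥(Pn...P2P1)^k−P0∥∥x∥.
With respect to the orthogonal decomposition H=H0⊕(H⊖H0) we have
Pi=I⊕Pi′, i=1,2,...,n and P0=I⊕0.
Hence
(Pn...P2P1)^k−P0 = 0⊕(Pn′...P2′P1′)^k = (Pn...P2P1−P0)^k
and
∥x_{nk}−P0x∥ ⩽ ∥Pn...P2P1−P0∥^k∥x∥ ⩽ (fn(c))^k∥x∥,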
where cF(H1,...,Hn)⩽c.
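In practice an estimate of the form ∥x_{nk}−P0x∥⩽q^k∥x∥ with q∈[0,1) (for example, q equal to any upper bound for fn(c) that is less than 1) translates into a number of cycles sufficient for a prescribed relative accuracy. The helper below is a small sketch of this bookkeeping; its name and interface are ours.

```python
def cycles_needed(q, tolerance):
    """Smallest k >= 1 with q**k <= tolerance, for 0 <= q < 1 and 0 < tolerance < 1."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must lie in [0, 1)")
    if not 0.0 < tolerance < 1.0:
        raise ValueError("tolerance must lie in (0, 1)")
    k, power = 1, q
    while power > tolerance:
        power *= q
        k += 1
    return k
```

For example, with q=1/2 ten cycles suffice for a relative error of at most 10⁻³, since 2⁻¹⁰<10⁻³<2⁻⁹.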
1.8. Notation
Throughout this paper H is a complex Hilbert space.
The inner product in H is denoted by ⟨⋅,⋅⟩ and ∥⋅∥
stands for the corresponding norm, ∥x∥=⟨x,x⟩^(1/2).
The identity operator on H is denoted by I
(throughout the paper it is clear which Hilbert space is being considered).
All vectors are column vectors; the letter “t” denotes the transpose.
2. Results and Questions
Our Main Problem is the following: find fn(c),c∈[0,1] for n⩾2.
It is trivial that f2(c)=c,c∈[0,1]
(this follows from the equality ∥P1P2−P0∥=cF(H1,H2)).
But what about fn, n⩾3?
Or, at least, what about f3?
2.1. The functions fn and an optimization problem
We will show that our Main Problem is equivalent to
a certain optimization problem on a subset of the set of
Hermitian complex n×n matrices.
For two Hermitian n×n matrices A,B
we will write A⩽B if
⟨Ax,x⟩⩽⟨Bx,x⟩ for every
x∈Cn, where ⟨⋅,⋅⟩ is the standard
inner product in the space Cn.
Equivalently, A⩽B if the matrix B−A is positive semidefinite.
Theorem 2.1.
The following equality holds:
fn(c) = max{∣a12a23...an−1,n∣},
where the maximum is taken over all Hermitian
complex matrices A=(aij∣i,j=1,...,n)
such that aii=1, i=1,...,n and 0⩽A⩽(1+(n−1)c)I.
Now it’s time for some notation.
For an n×n matrix A set
Π(A) := ∣a12a23...an−1,n∣ = ∏_{i=1}^{n−1}∣ai,i+1∣.
For a real number t⩾1 denote by Hn(t) the set of all
Hermitian matrices A=(aij∣i,j=1,...,n) such that
aii=1 for i=1,2,...,n and 0⩽A⩽tI.
Then Theorem 2.1 says that
fn(c) = max{Π(A) ∣ A∈Hn(1+(n−1)c)}.
The following natural problem arises.
Problem 1:
find an optimal matrix A for the optimization problem above, i.e.,
a matrix A∈Hn(1+(n−1)c) such that Π(A)=fn(c).
Or, at least, find the 1-diagonal (a12,a23,...,an−1,n) of an optimal matrix.
It is natural to try to reduce the set of matrices
on which the function Π is considered.
To this end we will use the following lemma.
Lemma 2.1.
Let t⩾1.
For an arbitrary matrix A∈Hn(t)
there exists a matrix B∈Hn(t) such that
(1) bij∈R for all i,j=1,2,...,n and bi,i+1⩾0 for i=1,2,...,n−1;
(2) bi,j=bn+1−i,n+1−j for all i,j=1,2,...,n;
(3) Π(B)⩾Π(A).
Denote by Hn′(t) the set of all matrices A∈Hn(t)
such that
(1) aij∈R for all i,j=1,...,n and ai,i+1⩾0 for all i=1,2,...,n−1;
(2) aij=an+1−i,n+1−j for all i,j=1,2,...,n.
For example, matrices from H3′(t) have the form
( 1  x  y )
( x  1  x )
( y  x  1 ),
where x⩾0 and y∈R;
matrices from H4′(t) have the form
( 1  x  z  w )
( x  1  y  z )
( z  y  1  x )
( w  z  x  1 ),
where x⩾0, y⩾0 and w,z∈R.
By Lemma 2.1, fn(c)=max{Π(A)∣A∈Hn′(1+(n−1)c)}.
Problem 1′: find an optimal matrix A∈Hn′(1+(n−1)c)
for the optimization problem above.
Or, at least, find the 1-diagonal (a12,a23,...,an−1,n)
of an optimal matrix A∈Hn′(1+(n−1)c).
Remark 2.1.
It is worth mentioning that there exists a unique optimal 1-diagonal
(a12,a23,...,an−1,n)
(i.e., the 1-diagonal of an optimal matrix) for which ai,i+1⩾0, i=1,2,...,n−1.
Indeed, assume that A,B∈Hn(1+(n−1)c) are optimal
and ai,i+1⩾0, bi,i+1⩾0 for i=1,2,...,n−1.
We claim that ai,i+1=bi,i+1, i=1,2,...,n−1.
If c=0, then Hn(1)={I} and our assertion is clear.
Assume that c>0.
Then ai,i+1>0 and bi,i+1>0 for i=1,2,...,n−1.
Consider the matrix C:=1/2(A+B).
It is clear that C∈Hn(1+(n−1)c).
Since ci,i+1=1/2(ai,i+1+bi,i+1)⩾√(ai,i+1bi,i+1), we conclude that
Π(C)⩾√(Π(A)Π(B))=fn(c).
It follows that Π(C)=fn(c), so equality holds in each of the inequalities ci,i+1⩾√(ai,i+1bi,i+1), and consequently ai,i+1=bi,i+1 for all i=1,2,...,n−1.
2.2. The function f3
Now we are ready to find f3.
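Theorem 2.2.
We have
f3(c) = 4c² for c∈[0,1/4] and f3(c) = c for c∈[1/4,1].
For c∈[0,1/4] the matrix
( 1    2c   −2c )
( 2c   1    2c  )
( −2c  2c   1   )
is optimal, for c∈[1/4,1] the matrix
( 1     √c   2c−1 )
( √c    1    √c   )
( 2c−1  √c   1    )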
is optimal.
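The statement can be checked numerically. The sketch below assumes that Theorem 2.2 reads f3(c)=4c² on [0,1/4] and f3(c)=c on [1/4,1], with optimal entries x=2c, y=−2c and x=√c, y=2c−1 respectively (the values obtained in the proof of the theorem). Admissibility 0⩽A⩽(1+2c)I is tested through the eigenvalues of a matrix of the form (1,x,y; x,1,x; y,x,1), which are 1−y and the two roots (1+y/2)±√(y²/4+2x²); the function names are ours.

```python
import math


def f3(c):
    """Assumed closed form of f_3 on [0, 1]."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must lie in [0, 1]")
    return 4.0 * c * c if c <= 0.25 else c


def optimal_matrix_3(c):
    """Candidate optimal matrix from H_3'(1 + 2c)."""
    if c <= 0.25:
        x, y = 2.0 * c, -2.0 * c
    else:
        x, y = math.sqrt(c), 2.0 * c - 1.0
    return [[1.0, x, y], [x, 1.0, x], [y, x, 1.0]]


def is_admissible_3(a, c, eps=1e-9):
    """Check 0 <= A <= (1 + 2c) I for A = [[1, x, y], [x, 1, x], [y, x, 1]]."""
    x, y = a[0][1], a[0][2]
    mean = 1.0 + y / 2.0
    radius = math.sqrt(y * y / 4.0 + 2.0 * x * x)
    eigenvalues = [1.0 - y, mean - radius, mean + radius]
    return all(-eps <= ev <= 1.0 + 2.0 * c + eps for ev in eigenvalues)
```

For every c the product a12·a23 of the candidate matrix equals `f3(c)`, and the extreme eigenvalues sit exactly at the ends 0 or 1+2c of the admissible interval, which is consistent with optimality.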
2.3. On the functions fn with n⩾4
For n=4, to find f4(c) one has to consider matrices of the form
( 1  x  z  w )
( x  1  y  z )
( z  y  1  x )
( w  z  x  1 ),
where x⩾0, y⩾0, w,z∈R,
and has to maximize x²y.
We could not find f4 (and fn for n⩾4).
Nevertheless we have the following theorem.
Theorem 2.3.
Let n⩾2 and c∈[0,1/(n−1)²].
Then fn(c)=(n−1)^(n−1)c^(n−1) and the matrix A∈Hn′(1+(n−1)c) defined by
aii=1 for i=1,...,n and aij=(−1)^(i+j+1)(n−1)c for i≠j
is optimal.
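A short sketch of this construction follows. It assumes that the optimal matrix is A=U∗MU with M=I+(n−1)c(I−J) and U=diag(−1,1,−1,...), as in the proof of the theorem, so that the eigenvalues of A are 1−(n−1)²c (simple) and 1+(n−1)c (of multiplicity n−1). The function names are ours.

```python
def small_c_optimal_matrix(n, c):
    """Entries a_ii = 1 and a_ij = (-1)^(i+j+1) (n-1) c for i != j."""
    r = (n - 1) * c
    return [[1.0 if i == j else (-1) ** (i + j + 1) * r for j in range(n)]
            for i in range(n)]


def superdiagonal_product(a):
    """Pi(A) = |a_12 a_23 ... a_(n-1,n)|."""
    product = 1.0
    for i in range(len(a) - 1):
        product *= abs(a[i][i + 1])
    return product


def extreme_eigenvalues(n, c):
    """Smallest and largest eigenvalue of the matrix above (valid for c >= 0)."""
    return 1.0 - (n - 1) ** 2 * c, 1.0 + (n - 1) * c
```

The smallest eigenvalue is nonnegative exactly when c⩽1/(n−1)², which is where the restriction on c in the theorem comes from.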
Although we could not find fn for n⩾4,
we know some properties of the function.
Firstly, note that fn is non-decreasing on [0,1]
(it follows directly from the definition of fn).
Theorem 2.4.
The function fn^(1/(n−1)) is concave on [0,1].
Corollary 2.1.
The function fn is continuous on [0,1].
Theorem 2.5.
The function fn satisfies the following functional equation:
fn(c) = (n−1)^(n−1) c^(n−1) fn(1/((n−1)²c))
for c∈[1/(n−1)²,1].
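As a sanity check, one can test the functional equation on the only case where fn is known in closed form, n=3. The sketch assumes that the equation has the form fn(c)=(n−1)^(n−1)c^(n−1)fn(1/((n−1)²c)) (our reading of the substitution used in its proof) and that f3(c)=4c² on [0,1/4] and f3(c)=c on [1/4,1]; the function names are ours.

```python
def f3(c):
    """Assumed closed form of f_3 on [0, 1]."""
    return 4.0 * c * c if c <= 0.25 else c


def functional_equation_residual(c):
    """f_3(c) - 4 c^2 f_3(1 / (4c)), expected to vanish for c in [1/4, 1]."""
    if not 0.25 <= c <= 1.0:
        raise ValueError("c must lie in [1/4, 1]")
    return f3(c) - 4.0 * c * c * f3(1.0 / (4.0 * c))
```

For c∈[1/4,1] the point 1/(4c) again lies in [1/4,1], so both sides reduce to c and the residual is zero up to rounding.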
Regarding Problem 1′, we have the following criterion for a matrix to be optimal.
Proposition 2.1.
Let c>0 and a matrix A∈Hn(1+(n−1)c)
be such that ai,i+1>0 for i=1,2,...,n−1.
Then A is optimal, i.e., Π(A)=fn(c), if and only if
∑_{i=1}^{n−1} bi,i+1/ai,i+1 ⩽ n−1
for arbitrary matrix B∈Hn(1+(n−1)c)
with bi,i+1⩾0 for i=1,2,...,n−1.
2.4. Bounds for fn(c): known upper bounds
In this subsection we present known upper bounds for fn(c).
The upper bounds are of great interest because using them
one can easily estimate the rate of convergence in the method of cyclic alternating projections
(see subsection 1.7).
Let H be a Hilbert space and H1,...,Hn be closed subspaces of H.
Denote by Pi the orthogonal projection onto Hi, i=1,...,n.
Set H0:=H1∩H2∩...∩Hn.
Denote by P0 the orthogonal projection onto H0.
Set cF:=cF(H1,...,Hn).
Let c∈[0,1].
In what follows we assume that cF⩽c.
Using Theorem 2.6 one can obtain a simpler (though weaker) estimate for fn(c).
One can easily check that a/b⩽(a+x)/(b+x),
where 0⩽a⩽b and x⩾0.
Setting a=n−4(n−1)sin²(π/(2n))(1−c),
b=n+4(n−1)²sin²(π/(2n))(1−c) and
x=4(n−1)sin²(π/(2n))(1−c), we get
fn(c) ⩽ (a/b)^(1/2) ⩽ ((a+x)/(b+x))^(1/2) = (1+4(n−1)sin²(π/(2n))(1−c))^(−1/2).
Using Taylor’s theorem with the Lagrange form of the remainder one can easily check that
(1+u)^(−1/2) ⩽ 1−u/2+(3/8)u²
for u⩾0.
Thus
fn(c) ⩽ 1−an(1−c)+bn(1−c)²,
where an=2(n−1)sin²(π/(2n)) and bn=6(n−1)²sin⁴(π/(2n)).
Note that a2=1 and a3=1.
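The coefficients are easy to tabulate. The sketch below assumes that the quadratic estimate reads fn(c)⩽1−an(1−c)+bn(1−c)² and that it is obtained from the intermediate estimate (1+2an(1−c))^(−1/2) via the Taylor inequality with u=2an(1−c); the function names are ours.

```python
import math


def a_coeff(n):
    """a_n = 2 (n-1) sin^2(pi / (2n))."""
    return 2.0 * (n - 1) * math.sin(math.pi / (2 * n)) ** 2


def b_coeff(n):
    """b_n = 6 (n-1)^2 sin^4(pi / (2n))."""
    return 6.0 * (n - 1) ** 2 * math.sin(math.pi / (2 * n)) ** 4


def intermediate_upper_bound(n, c):
    """(1 + 4 (n-1) sin^2(pi/(2n)) (1-c))^(-1/2) = (1 + 2 a_n (1-c))^(-1/2)."""
    return (1.0 + 2.0 * a_coeff(n) * (1.0 - c)) ** -0.5


def quadratic_upper_bound(n, c):
    """1 - a_n (1-c) + b_n (1-c)^2."""
    return 1.0 - a_coeff(n) * (1.0 - c) + b_coeff(n) * (1.0 - c) ** 2
```

Both bounds equal 1 at c=1 and share the slope an there; the quadratic one is never smaller than the intermediate one on [0,1].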
Question 1.
Is it true that fn(c)⩽1−an(1−c) for all c∈[0,1]?
Or, at least, for all c which are sufficiently close to 1?
2.6. Bounds for fn: lower bounds
Theorem 2.7.
For every n⩾2 there exists a positive constant b̃n such that
fn(c) ⩾ 1−an(1−c)−b̃n(1−c)²
for all c∈[0,1].
Consequently, we have
1−an(1−c)−b̃n(1−c)² ⩽ fn(c) ⩽ 1−an(1−c)+bn(1−c)²
for all c∈[0,1].
These inequalities mean that the estimate for fn(c)
given by Theorem 2.6 is optimal for c≈1,
up to O((1−c)²), c→1−.
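Proof of Proposition 1.2.
Denote by gn(c) the right-hand side of the equality in Proposition 1.2, i.e., gn(c):=sup{∥Pn...P2P1∥∣cD(H1,...,Hn)⩽c}, c∈[0,1].
We have to prove that fn(c)=gn(c) for every c∈[0,1].
First, we will show that fn(c)⩽gn(c), c∈[0,1].
Consider an arbitrary system of subspaces H1,...,Hn of a Hilbert space H such that
cF(H1,...,Hn)⩽c.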
Set H0:=H1∩...∩Hn and denote by P0 the orthogonal projection onto H0.
Let us prove that ∥Pn...P2P1−P0∥⩽gn(c).
To this end consider the orthogonal decomposition H=H0⊕(H⊖H0)=:H0⊕H′.
With respect to this orthogonal decomposition Hi=H0⊕(Hi⊖H0)=:H0⊕Hi′,
i=1,2,...,n.
Thus
cF(H1,...,Hn) = cD(H1′,...,Hn′).
Therefore cD(H1′,...,Hn′)⩽c.
Further, with respect to the orthogonal decomposition H=H0⊕H′ we have
Pi=I⊕Pi′, where Pi′ is the orthogonal projection onto Hi′ in H′,
i=1,2,...,n, and P0=I⊕0.
Thus Pn...P2P1−P0=0⊕Pn′...P2′P1′ whence
∥Pn...P2P1−P0∥ = ∥Pn′...P2′P1′∥ ⩽ gn(c)
because cD(H1′,...,Hn′)⩽c.
It follows that fn(c)⩽gn(c).
Now we will show that gn(c)⩽fn(c).
Let us prove this inequality for c∈[0,1).
Consider arbitrary system of subspaces H1,...,Hn of a Hilbert space H such that
cD(H1,...,Hn)⩽c.
Let us prove that ∥Pn...P2P1∥⩽fn(c).
Since cD(H1,...,Hn)<1, we conclude that H1∩...∩Hn={0}.
Indeed, assume that H1∩...∩Hn≠{0}.
Take a vector u∈H1∩...∩Hn, u≠0, and set xi=u, i=1,2,...,n.
Then
(1/(n−1))∑_{i≠j}⟨xi,xj⟩/(∥x1∥²+...+∥xn∥²) = (1/(n−1))⋅n(n−1)∥u∥²/(n∥u∥²) = 1,
whence cD(H1,...,Hn)=1, contradiction.
Therefore H1∩...∩Hn={0}.
Thus cF(H1,...,Hn)=cD(H1,...,Hn)⩽c and P0=0.
Hence
∥Pn...P2P1∥ = ∥Pn...P2P1−P0∥ ⩽ fn(c).
It follows that gn(c)⩽fn(c).
Let us show that gn(1)⩽fn(1).
It is clear that
gn(1) = 1
(just take Hi=H, i=1,2,...,n, then ∥Pn...P2P1∥=∥I∥=1).
So we have to show that fn(1)⩾1.
To this end we will show that the number ∥Pn...P2P1−P0∥ can be arbitrarily close to 1.
Let H=C² be the two-dimensional Hilbert space.
For an angle φ∈(0,π/2] define two subspaces
M := {z(cosφ,sinφ)^t ∣ z∈C}
and
N := {z(1,0)^t ∣ z∈C}.
Then M∩N={0} and for the orthogonal projections PM and PN onto the subspaces M and N, respectively,
we have ∥PNPM∥=∥PN(cosφ,sinφ)^t∥=cosφ.
Thus for a system of n subspaces H1=M, Hi=N, i=2,3,...,n, the number
∥Pn...P2P1−P0∥ = ∥PN^(n−1)PM∥ = ∥PNPM∥ = cosφ
can be arbitrarily close to 1.
Therefore fn(1)⩾1.
So, we proved that fn(c)⩽gn(c) and gn(c)⩽fn(c).
It follows that fn(c)=gn(c), c∈[0,1].
Proof of Theorem 2.1.
First, note that the maximum
max{∣a12a23...an−1,n∣∣A∈Hn(1+(n−1)c)} exists, i.e., is attained.
This is a direct consequence of the following two facts:
the function A↦∣a12a23...an−1,n∣ is continuous and the set Hn(1+(n−1)c)
is compact.
Let us show that
fn(c) ⩾ max{∣a12a23...an−1,n∣ ∣ A∈Hn(1+(n−1)c)}.
To this end we consider arbitrary matrix A∈Hn(1+(n−1)c).
We have to show that fn(c)⩾∣a12a23...an−1,n∣.
Since A is Hermitian and positive semidefinite,
we conclude that A=B∗B for some n×n matrix B.
Let v1,...,vn be the columns of B, i.e., B=(v1v2...vn).
We have aij=∑_{k=1}^n b̄ki bkj=⟨vj,vi⟩, where the bar denotes complex conjugation.
This means that A is the Gram matrix of the vectors v1,...,vn.
Since aii=1, we see that ∥vi∥=1, i=1,2,...,n.
Consider the system of one dimensional subspaces Hi={avi∣a∈C}, i=1,2,...,n.
We claim that cD(H1,...,Hn)⩽c and ∥Pn...P1∥=∣a12a23...an−1,n∣.
It will follow that fn(c)⩾∣a12a23...an−1,n∣.
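First consider the norm ∥Pn...P1∥.
Since Pix=⟨x,vi⟩vi, x∈Cn, one can easily check that
Pn...P2P1x = ⟨x,v1⟩⟨v1,v2⟩⟨v2,v3⟩...⟨vn−1,vn⟩vn, x∈Cn.
It follows that
∥Pn...P2P1x∥ = ∣⟨x,v1⟩∣∣a12a23...an−1,n∣ ⩽ ∣a12a23...an−1,n∣∥x∥, x∈Cn,
with equality for x=v1.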
Thus ∥Pn...P1∥=∣a12a23...an−1,n∣.
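The identity for the product of rank-one projections can be illustrated as follows. The sketch works with real unit vectors only (an assumption made for simplicity; the paper works over C), applies the projections one by one to x=v1, where the norm is attained, and compares the result with the product of the consecutive inner products. The function names are ours.

```python
import math


def inner(u, v):
    """Standard real inner product."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(inner(u, u))


def normalize(u):
    length = norm(u)
    return [a / length for a in u]


def apply_projections(x, unit_vectors):
    """Apply P_n ... P_1 to x, where P_i x = <x, v_i> v_i and each v_i is a unit vector."""
    for v in unit_vectors:
        coef = inner(x, v)
        x = [coef * a for a in v]
    return x


def predicted_norm(unit_vectors):
    """|<v_1, v_2> <v_2, v_3> ... <v_(n-1), v_n>|, i.e. |a_12 a_23 ... a_(n-1,n)|."""
    product = 1.0
    for u, v in zip(unit_vectors, unit_vectors[1:]):
        product *= abs(inner(u, v))
    return product
```

With v1, v2, v3 obtained by normalizing (1,0,0), (1,1,0), (0,1,1), the vector `apply_projections(v1, [v1, v2, v3])` has norm `predicted_norm([v1, v2, v3])`, namely 1/(2√2).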
Let us show that cD(H1,...,Hn)⩽c.
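For arbitrary vectors x1=a1v1,...,xn=anvn we have
∑_{i,j=1}^n⟨xj,xi⟩ = ∥a1v1+...+anvn∥² = ⟨Aa,a⟩ ⩽ (1+(n−1)c)∥a∥² = (1+(n−1)c)∑_{i=1}^n∥xi∥²,
where a=(a1,...,an)^t.
It follows that ∑_{i≠j}⟨xj,xi⟩⩽(n−1)c∑_{i=1}^n∥xi∥².
Therefore
cD(H1,...,Hn) ⩽ c.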
Let us show that
fn(c) ⩽ max{∣a12a23...an−1,n∣ ∣ A∈Hn(1+(n−1)c)}.
Define K:=max{∣a12a23...an−1,n∣∣A∈Hn(1+(n−1)c)}
and consider arbitrary system of subspaces H1,...,Hn of a Hilbert space H such that cD(H1,...,Hn)⩽c.
We have to prove that ∥Pn...P2P1∥⩽K.
Let v1∈H1,...,vn∈Hn be arbitrary elements with ∥vi∥=1, i=1,...,n.
Denote by G the Gram matrix of these elements, i.e., G=(gij=⟨vj,vi⟩∣i,j=1,...,n).
We claim that G∈Hn(1+(n−1)c).
Indeed, it is clear that G∗=G⩾0 and gii=∥vi∥2=1, i=1,...,n.
It remains to show that G⩽(1+(n−1)c)I.
For arbitrary scalars a1,...,an we have
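⟨Ga,a⟩ = ∥a1v1+...+anvn∥² = ∑_{i=1}^n∥aivi∥²+∑_{i≠j}⟨ajvj,aivi⟩ ⩽ (1+(n−1)c)∑_{i=1}^n∥aivi∥² = (1+(n−1)c)∥a∥²,
where a=(a1,...,an)^t.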
It follows that G⩽(1+(n−1)c)I.
(It is worth mentioning that this follows also from [3, Proposition 3.4] formulated for
the nonreduced configuration constant and [3, Proposition 3.6(f)].)
Since G∈Hn(1+(n−1)c), we conclude that ∣g12g23...gn−1,n∣⩽K, i.e.,
∣⟨v1,v2⟩⟨v2,v3⟩...⟨vn−1,vn⟩∣⩽K.
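It follows that for arbitrary elements u1∈H1,...,un∈Hn we have
∣⟨u1,u2⟩⟨u2,u3⟩...⟨un−1,un⟩∣ ⩽ K∥u1∥∥u2∥²...∥un−1∥²∥un∥.
Now consider arbitrary x∈H and set ui:=PiPi−1...P1x, i=1,...,n.
Then ⟨ui,ui+1⟩=⟨ui,Pi+1ui⟩=∥Pi+1ui∥²=∥ui+1∥², i=1,...,n−1, and the inequality above takes the form
∥u2∥²∥u3∥²...∥un∥² ⩽ K∥u1∥∥u2∥²...∥un−1∥²∥un∥.
If un≠0, then u2,...,un−1 are nonzero as well, and after cancellation we get ∥un∥⩽K∥u1∥⩽K∥x∥.
Thus ∥Pn...P2P1x∥⩽K∥x∥ for every x∈H, i.e., ∥Pn...P2P1∥⩽K.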
Proof of Lemma 2.1.
First, note that the set Hn(t) has the following properties:
(1) if A∈Hn(t) and U is a diagonal unitary matrix, i.e.,
U=diag(u1,...,un), where u1,...,un are scalars with ∣ui∣=1, i=1,2,...,n,
then U∗AU∈Hn(t);
(2) if A∈Hn(t), then A⊤∈Hn(t).
Here (A⊤)ij=aji, i,j=1,2,...,n;
(3) if A∈Hn(t), then Â∈Hn(t).
Here (Â)ij=an+1−i,n+1−j, i,j=1,2,...,n;
(4) the set Hn(t) is convex.
Now we are ready to prove the needed assertion.
Let A∈Hn(t).
For a diagonal unitary matrix U=diag(u1,u2,...,un) define B:=U∗AU.
Then B∈Hn(t).
Moreover, since bi,i+1=ūi ai,i+1 ui+1,
one can choose scalars u1,...,un so that bi,i+1=∣ai,i+1∣
for i=1,2,...,n−1.
Then Π(B)=Π(A).
Further, consider the matrix B⊤ and set C:=1/2(B+B⊤).
Then C∈Hn(t).
We have cij=1/2(bij+bji)=Re(bij)∈R
and ci,i+1=bi,i+1⩾0.
Therefore Π(C)=Π(B).
Finally, consider the matrix Ĉ with entries (Ĉ)ij=cn+1−i,n+1−j and set D:=1/2(C+Ĉ).
Then D∈Hn(t).
The matrix D has the following properties:
(1) dij=1/2(cij+cn+1−i,n+1−j)∈R for all i,j and
di,i+1=1/2(ci,i+1+cn+1−i,n−i)=1/2(ci,i+1+cn−i,n−i+1)⩾0 for i=1,2,...,n−1;
(2) dn+1−i,n+1−j=dij for all i,j=1,2,...,n;
(3) since di,i+1=1/2(ci,i+1+cn−i,n−i+1)⩾√(ci,i+1cn−i,n−i+1) for i=1,2,...,n−1,
we conclude that
Π(D) ⩾ √(Π(C)Π(C)) = Π(C) = Π(A).
Thus D is the required matrix.
Proof of Theorem 2.2.
To find f3(c) one can consider matrices of the form
A =
( 1  x  y )
( x  1  x )
( y  x  1 ),
where x⩾0 and y∈R.
We have to maximize x² under the condition
0⩽A⩽(1+2c)I.
Consider the condition A⩾0.
It is well-known that a Hermitian matrix is positive semidefinite if and only if
every principal minor of the matrix (including its determinant) is nonnegative.
(Recall that a principal minor is the determinant of a principal submatrix;
a principal submatrix is a square submatrix obtained by removing certain
rows and columns with the same index sets.)
Using this criterion one can easily check that A⩾0 if and only if
x⩽1, ∣y∣⩽1 and x²⩽(1+y)/2.
Consider the condition A⩽(1+2c)I⇔(1+2c)I−A⩾0.
Now one can easily check that A⩽(1+2c)I if and only if
x⩽2c, ∣y∣⩽2c and x²⩽c(2c−y).
Hence, 0⩽A⩽(1+2c)I if and only if
0⩽x⩽min{1,2c}, ∣y∣⩽min{1,2c} and x²⩽min{(1+y)/2, c(2c−y)}.
We have to maximize x² under these conditions.
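Before solving this maximization problem analytically, one can get a feeling for the answer by brute force. The sketch below assumes that the admissible set is described by ∣y∣⩽min{1,2c} and x²⩽min{(1+y)/2, c(2c−y)} (on this set the bound x⩽min{1,2c} holds automatically), scans a grid of values of y and records the largest admissible x². The function name and the grid size are ours.

```python
def f3_by_grid_search(c, steps=400):
    """Approximate max of x^2 subject to |y| <= min(1, 2c), x^2 <= min((1+y)/2, c(2c-y))."""
    y_max = min(1.0, 2.0 * c)
    best = 0.0
    for i in range(steps + 1):
        y = -y_max + 2.0 * y_max * i / steps
        # for a fixed y the largest admissible x^2 is the smaller of the two bounds
        best = max(best, min((1.0 + y) / 2.0, c * (2.0 * c - y)))
    return best
```

The values returned for c=0.1, 0.5 and 0.7 are close to 0.04, 0.5 and 0.7; the search is only a numerical check and does not replace the argument that follows.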
Define two linear functions φ(y)=(1+y)/2 and ψ(y)=c(2c−y).
It is clear that φ is increasing and ψ is nonincreasing.
Consider the equation φ(y)=ψ(y).
The unique solution is y=2c−1.
Therefore
min{φ(y),ψ(y)} = φ(y) for y⩽2c−1 and min{φ(y),ψ(y)} = ψ(y) for y⩾2c−1.
This minimum attains its maximum value c at the point y=2c−1.
Thus x²⩽c and x⩽√c.
Let us check for which c∈[0,1] the values x=√c and y=2c−1 are permissible.
First consider the inequality ∣y∣⩽min{1,2c}.
It is clear that −1⩽2c−1⩽1 and 2c−1⩽2c.
However, the inequality 2c−1⩾−2c holds only for c⩾1/4.
For such c we have √c⩽1 and √c⩽2c.
Conclusion: for c∈[1/4,1] the optimal values are x=√c, y=2c−1, the optimal matrix is equal to
( 1     √c   2c−1 )
( √c    1    √c   )
( 2c−1  √c   1    )
and f3(c)=c.
Consider the case c∈[0,1/4).
Then 2c−1<−2c and hence the conditions for x and y can be rewritten as
0⩽x⩽2c, ∣y∣⩽2c and x²⩽c(2c−y).
Now it is easy to see that the optimal values of x and y are x=2c and y=−2c.
Therefore the optimal matrix is equal to
( 1    2c   −2c )
( 2c   1    2c  )
( −2c  2c   1   )
and f3(c)=4c².
Proof of Theorem 2.3.
Consider an arbitrary matrix A∈Hn(1+(n−1)c).
Since A⩽(1+(n−1)c)I, we conclude that the matrix (1+(n−1)c)I−A
is positive semidefinite.
It follows that the determinant of every 2×2 principal submatrix
( (n−1)c  −aij   )
( −aji    (n−1)c ), i≠j,
is nonnegative, i.e., (n−1)²c²−∣aij∣²⩾0, ∣aij∣⩽(n−1)c.
Therefore ∣a12a23...an−1,n∣⩽(n−1)^(n−1)c^(n−1).
On the other hand, consider the n×n matrix J where each entry is equal to 1, i.e.,
J = (jkl∣k,l=1,...,n), jkl=1 for all k,l.
It is easily seen that J is positive semidefinite and the largest eigenvalue of J equals n.
Thus 0⩽J⩽nI,
−nI⩽−J⩽0,
−(n−1)I⩽I−J⩽I,
−(n−1)²cI⩽(n−1)c(I−J)⩽(n−1)cI and
(1−(n−1)²c)I ⩽ I+(n−1)c(I−J) ⩽ (1+(n−1)c)I.
Set M:=I+(n−1)c(I−J).
Since c∈[0,1/(n−1)²] and
mij=1 for i=j, mij=−(n−1)c for i≠j,
we see that M∈Hn(1+(n−1)c) and Π(M)=(n−1)^(n−1)c^(n−1).
Therefore fn(c)=(n−1)^(n−1)c^(n−1).
Finally, define U:=diag(−1,1,−1,1,...) and consider the matrix A:=U∗MU.
Since
aij=1 for i=j, aij=(−1)^(i+j+1)(n−1)c for i≠j,
we conclude that A∈Hn′(1+(n−1)c) and Π(A)=(n−1)^(n−1)c^(n−1).
Thus A is optimal.
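Proof of Theorem 2.4.
Let c1,c2∈[0,1] and λ∈(0,1).
We have to show that
(fn(λc1+(1−λ)c2))^(1/(n−1)) ⩾ λ(fn(c1))^(1/(n−1))+(1−λ)(fn(c2))^(1/(n−1)).
Let A∈Hn(1+(n−1)c1) be such that ai,i+1⩾0 for i=1,2,...,n−1 and Π(A)=fn(c1).
Let B∈Hn(1+(n−1)c2) be such that bi,i+1⩾0 for i=1,2,...,n−1 and Π(B)=fn(c2).
Consider the matrix λA+(1−λ)B.
It is clear that λA+(1−λ)B∈Hn(1+(n−1)(λc1+(1−λ)c2)).
Thus
fn(λc1+(1−λ)c2) ⩾ Π(λA+(1−λ)B) = ∏_{i=1}^{n−1}(λai,i+1+(1−λ)bi,i+1),
whence
(fn(λc1+(1−λ)c2))^(1/(n−1)) ⩾ (∏_{i=1}^{n−1}(λai,i+1+(1−λ)bi,i+1))^(1/(n−1)).
Now we will use the inequality
(∏_{i=1}^m(si+ti))^(1/m) ⩾ (∏_{i=1}^m si)^(1/m)+(∏_{i=1}^m ti)^(1/m),
where m is a natural number and numbers s1,...,sm,t1,...,tm are nonnegative.
We have
(∏_{i=1}^{n−1}(λai,i+1+(1−λ)bi,i+1))^(1/(n−1)) ⩾ λ(Π(A))^(1/(n−1))+(1−λ)(Π(B))^(1/(n−1)) = λ(fn(c1))^(1/(n−1))+(1−λ)(fn(c2))^(1/(n−1)),
which proves the required inequality.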
Proof of Corollary 2.1.
Define the function gn:=fn^(1/(n−1)).
Let us prove that gn is continuous on [0,1].
It will follow that fn=gn^(n−1) is also continuous on [0,1].
We will use the following well-known fact:
if a function φ:(a,b)→R is convex on (a,b),
then φ is continuous on (a,b).
Since gn is concave on [0,1] (by Theorem 2.4),
we conclude that gn is continuous on (0,1).
Theorem 2.3 implies that gn(c)=(n−1)c for c∈[0,1/(n−1)²].
Thus gn is continuous at the point 0.
Let us show that gn is continuous at the point 1.
We have gn(1)=(fn(1))^(1/(n−1))=1 (Proposition 1.2 implies
that fn(1)=1) and gn(0)=0.
Since gn is concave on [0,1], we conclude that
gn(c)⩾(1−c)gn(0)+cgn(1)=c for all c∈[0,1].
Since gn is non-decreasing on [0,1], we conclude that
gn(c)⩽1 for all c∈[0,1].
Thus c⩽gn(c)⩽1 for c∈[0,1].
It follows that limc→1−gn(c)=1=gn(1).
Therefore gn is continuous at the point 1.
We proved that the function gn is continuous at every point of the segment [0,1].
Thus gn is continuous on [0,1].
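Proof of Theorem 2.5.
Fix c∈[1/(n−1)²,1].
Consider an arbitrary matrix A∈Hn(1+(n−1)c).
Then
0⩽A⩽(1+(n−1)c)I,
0⩽(1+(n−1)c)I−A⩽(1+(n−1)c)I and
0 ⩽ (1/((n−1)c))((1+(n−1)c)I−A) ⩽ (1+1/((n−1)c))I.
Define
B := (1/((n−1)c))((1+(n−1)c)I−A),
then
bii=1, i=1,2,...,n and
bij=−aij/((n−1)c) for i≠j.
It follows that B∈Hn(1+(n−1)/((n−1)²c)) and
Π(B)=Π(A)/((n−1)^(n−1)c^(n−1)).
Since the mapping A↦B from
Hn(1+(n−1)c) to Hn(1+(n−1)/((n−1)²c))
is one-to-one and onto, we conclude that
fn(1/((n−1)²c)) = fn(c)/((n−1)^(n−1)c^(n−1)),
which is the required functional equation.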
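Proof of Proposition 2.1.
Assume that the inequality ∑_{i=1}^{n−1}bi,i+1/ai,i+1⩽n−1 holds
for arbitrary matrix B∈Hn(1+(n−1)c) with bi,i+1⩾0 for i=1,2,...,n−1.
Then, by the inequality between the geometric and arithmetic means,
(Π(B)/Π(A))^(1/(n−1)) = (∏_{i=1}^{n−1}bi,i+1/ai,i+1)^(1/(n−1)) ⩽ (1/(n−1))∑_{i=1}^{n−1}bi,i+1/ai,i+1 ⩽ 1.
It follows that
Π(B) ⩽ Π(A)
and therefore fn(c)=Π(A).
Now assume that a matrix A is optimal, i.e., Π(A)=fn(c).
Consider arbitrary matrix B∈Hn(1+(n−1)c) with bi,i+1⩾0 for i=1,2,...,n−1.
For arbitrary number α∈[0,1] the matrix (1−α)A+αB belongs to Hn(1+(n−1)c).
Define the function
φ(α) := Π((1−α)A+αB) = ∏_{i=1}^{n−1}((1−α)ai,i+1+αbi,i+1), α∈[0,1].
Since A is optimal, we conclude that φ(α)⩽Π(A)=φ(0) for α∈[0,1].
It follows that φ′(0)⩽0, i.e.,
Π(A)∑_{i=1}^{n−1}(bi,i+1−ai,i+1)/ai,i+1 ⩽ 0, whence ∑_{i=1}^{n−1}bi,i+1/ai,i+1 ⩽ n−1.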
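Lemma 3.1.
For arbitrary real numbers a1,...,an the following inequality holds:
∑_{1⩽i<j⩽n}(ai−aj)² ⩽ Dn∑_{i=1}^{n−1}(ai−ai+1)², where Dn:=n/(4sin²(π/(2n))).
Proof.
Consider the inequality
∑_{1⩽i<j⩽n}(ai−aj)² ⩽ D∑_{i=1}^{n−1}(ai−ai+1)², (3.3)
where D>0 and a1,...,an∈R.
We have to show that this inequality is valid for D=Dn and arbitrary a1,...,an∈R.
Inequality (3.3) does not change after substitution ai→ai+b, i=1,2,...,n, where b∈R.
Therefore without loss of generality we can and will assume that a1+...+an=0.
Then the left side of inequality (3.3) is equal to
n∑_{i=1}^n ai²−(a1+...+an)² = n∑_{i=1}^n ai².
Thus inequality (3.3) is equivalent to the inequality
n∑_{i=1}^n ai² ⩽ D∑_{i=1}^{n−1}(ai−ai+1)²,
which is equivalent to
∑_{i=1}^{n−1}(ai−ai+1)² ⩾ (n/D)∑_{i=1}^n ai². (3.4)
Define the matrix
L =
(  1  −1   0  ...   0   0 )
( −1   2  −1  ...   0   0 )
(  0  −1   2  ...   0   0 )
( ...                     )
(  0   0   0  ...   2  −1 )
(  0   0   0  ...  −1   1 )
corresponding to the quadratic form ∑_{i=1}^{n−1}(ai−ai+1)².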
The matrix L is the Laplacian matrix of the graph Pn with vertices 1,2,...,n and
edges {1,2},{2,3},...,{n−1,n} (the path of length n−1).
Let λ1⩽λ2⩽...⩽λn be the spectrum of L.
It is clear that the eigenvalue λ1=0 (with a corresponding eigenvector (1,1,...,1)t)
and the multiplicity of λ1 is equal to 1.
Inequality (3.4) can be written as ⟨La,a⟩⩾(n/D)∥a∥², where
a vector a=(a1,...,an)^t is orthogonal to the vector (1,1,...,1)^t.
Therefore this inequality will be valid if n/D=λ2, i.e., if D=n/λ2.
It is well-known that λ2=4sin²(π/(2n)).
Thus inequality (3.3) will be valid with D=n/(4sin²(π/(2n)))=Dn.
∎
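The spectral fact used in the proof can be checked directly. The sketch below builds the Laplacian matrix of the path Pn, verifies that the vector with components cos((2k−1)π/(2n)), k=1,...,n, is an eigenvector for the value 4sin²(π/(2n)) (it does not verify that this is the second smallest eigenvalue), and evaluates both sides of the inequality of Lemma 3.1 for a given vector. The function names are ours.

```python
import math


def path_laplacian(n):
    """Laplacian of the path with vertices 1..n: diagonal (1, 2, ..., 2, 1), off-diagonals -1."""
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] -= 1.0
        lap[i + 1][i] -= 1.0
    return lap


def eigenpair_residual(n):
    """max_i |(L v - lam v)_i| for lam = 4 sin^2(pi/(2n)), v_k = cos((2k-1) pi/(2n))."""
    lam = 4.0 * math.sin(math.pi / (2 * n)) ** 2
    v = [math.cos((2 * k - 1) * math.pi / (2 * n)) for k in range(1, n + 1)]
    lap = path_laplacian(n)
    lv = [sum(lap[i][j] * v[j] for j in range(n)) for i in range(n)]
    return max(abs(lv[i] - lam * v[i]) for i in range(n))


def lemma_constant(n):
    """D_n = n / (4 sin^2(pi/(2n)))."""
    return n / (4.0 * math.sin(math.pi / (2 * n)) ** 2)


def pairwise_square_sum(a):
    """Sum of (a_i - a_j)^2 over i < j."""
    return sum((a[i] - a[j]) ** 2 for i in range(len(a)) for j in range(i + 1, len(a)))


def consecutive_square_sum(a):
    """Sum of (a_i - a_(i+1))^2."""
    return sum((s - t) ** 2 for s, t in zip(a, a[1:]))
```

For the eigenvector above the two sides of the inequality of Lemma 3.1 coincide up to rounding, which reflects the fact that the constant Dn cannot be improved.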
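Lemma 3.2.
For arbitrary vectors v1,...,vn∈H the following inequality holds:
∑_{1⩽i<j⩽n}∥vi−vj∥² ⩽ Dn∑_{i=1}^{n−1}∥vi−vi+1∥².
Proof.
Set a1:=0 and ai:=∥v1−v2∥+...+∥vi−1−vi∥ for i⩾2.
For i<j we have ai−aj=−(∥vi−vi+1∥+...+∥vj−1−vj∥).
Using Lemma 3.1 we get
∑_{1⩽i<j⩽n}(ai−aj)² ⩽ Dn∑_{i=1}^{n−1}(ai−ai+1)² = Dn∑_{i=1}^{n−1}∥vi−vi+1∥².
It follows that
∑_{1⩽i<j⩽n}∥vi−vj∥² ⩽ ∑_{1⩽i<j⩽n}(∥vi−vi+1∥+...+∥vj−1−vj∥)² = ∑_{1⩽i<j⩽n}(ai−aj)² ⩽ Dn∑_{i=1}^{n−1}∥vi−vi+1∥².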
∎
Now we are ready to prove Theorem 2.6.
The proof of Theorem 2.6 is based on Proposition 1.2.
Let H be a complex Hilbert space and H1,...,Hn be closed subspaces of H.
Denote by Pi the orthogonal projection onto Hi, i=1,...,n.
Assume that cD(H1,...,Hn)⩽c.
We have to prove that
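∥Pn...P2P1∥ ⩽ ((n−4(n−1)sin²(π/(2n))(1−c))/(n+4(n−1)²sin²(π/(2n))(1−c)))^(1/2).
By the definition of cD for arbitrary vectors x1∈H1,...,xn∈Hn we have
∑_{i≠j}⟨xi,xj⟩ ⩽ (n−1)c∑_{i=1}^n∥xi∥².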
The proof of Theorem 2.7 is based on Proposition 1.2.
Consider the two-dimensional Hilbert space H=C2.
For a number α∈R let
L(α)={(cosα,sinα)tz∣z∈C}
be the one-dimensional subspace spanned by the vector (cosα,sinα)t.
Let α1,...,αn be real numbers such that αi≠αj for some
i and j.
For each τ⩾0
consider the system of one-dimensional subspaces
Hk:=L(αkτ), k=1,...,n.
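Let us find
c(τ) := cD(H1,...,Hn) = cD(L(α1τ),...,L(αnτ)).
By Proposition 1.1 we have
∥P1+...+Pn∥=1+(n−1)c(τ), where Pk is the orthogonal projection
onto L(αkτ), k=1,2,...,n.
We have
Pk =
( cos²(αkτ)          cos(αkτ)sin(αkτ) )
( cos(αkτ)sin(αkτ)   sin²(αkτ)        )
for k=1,2,...,n.
Therefore
P1+...+Pn =
( ∑_k cos²(αkτ)          ∑_k cos(αkτ)sin(αkτ) )
( ∑_k cos(αkτ)sin(αkτ)   ∑_k sin²(αkτ)        ) =: M(τ).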
Let us find ∥P1+...+Pn∥=∥M(τ)∥.
Since the matrix M(τ) is Hermitian and positive semidefinite,
we conclude that ∥M(τ)∥ is equal to the largest eigenvalue of M(τ).
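The characteristic polynomial of M(τ) is equal to
λ²−tr(M(τ))λ+det(M(τ)).
It is clear that the trace of M(τ) is equal to n.
Consider
d(τ) := det(M(τ)) = (∑_k cos²(αkτ))(∑_k sin²(αkτ))−(∑_k cos(αkτ)sin(αkτ))² = ∑_{i<j}sin²((αi−αj)τ).
Now we have the following equation for the eigenvalues of M(τ):
λ²−nλ+d(τ) = 0.
The largest root is equal to (n+√(n²−4d(τ)))/2.
Therefore
c(τ) = (∥M(τ)∥−1)/(n−1) = (n−2+√(n²−4d(τ)))/(2(n−1)).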
Now we note a few properties of the functions c(τ) and d(τ):
(1) d(0)=0 and c(0)=1;
(2) the functions d and c are continuous on [0,+∞);
(3) there exists τ0=τ0(α1,...,αn)>0 such that
d is increasing on [0,τ0].
Consequently, c is decreasing on [0,τ0].
(4) Since sin²(ατ)=α²τ²+O(τ⁴) as τ→0+,
we conclude that
d(τ) = s1τ²+O(τ⁴), τ→0+,
where s1=s1(α1,...,αn)=∑_{i<j}(αi−αj)².
(5) Since √(1+u)=1+u/2+O(u²) as u→0, we conclude that
c(τ) = 1−(s1/(n(n−1)))τ²+O(τ⁴), τ→0+.
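Now using Proposition 1.2,
(3.7), (3.9) and (3.8)
we get
fn(c(τ)) ⩾ ∥Pn...P2P1∥ = ∏_{i=1}^{n−1}∣cos((αi−αi+1)τ)∣ ⩾ 1−(n(n−1)s2/(2s1))(1−c(τ))−K(1−c(τ))²
for τ∈(0,τ1], where s2=s2(α1,...,αn)=∑_{i=1}^{n−1}(αi−αi+1)²,
τ1=τ1(α1,...,αn)>0 and K=K(α1,...,αn).
Thus
fn(c) ⩾ 1−(n(n−1)s2/(2s1))(1−c)−K(1−c)² (3.10)
for all c∈[c(τ1),1].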
Now we want to choose α1,...,αn for which the value of s2/s1
is as small as possible.
Consider s2/s1.
Since the value of s2/s1 does not change under substitution
αi→αi+a, i=1,2,...,n, a∈R,
we can and will assume that α1+...+αn=0.
This equality means that the vector
α=(α1,...,αn)t is orthogonal to
the vector e=(1,...,1)t.
For such α we have
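s1 = ∑_{i<j}(αi−αj)² = n∑_{i=1}^n αi²−(α1+...+αn)² = n∥α∥².
Also s2=⟨Lα,α⟩, where
L =
(  1  −1   0  ...   0   0 )
( −1   2  −1  ...   0   0 )
(  0  −1   2  ...   0   0 )
( ...                     )
(  0   0   0  ...   2  −1 )
(  0   0   0  ...  −1   1 )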
and ⟨⋅,⋅⟩ is the standard inner product in Rn.
Note that the matrix L is the Laplacian matrix of the graph
Pn with vertices 1,2,...,n and
edges {1,2},{2,3},...,{n−1,n} (the path of length n−1).
Let λ1⩽λ2⩽...⩽λn
be the spectrum of L.
It is clear that L is positive semidefinite and ker(L)
is the one-dimensional subspace spanned by the vector e.
Thus λ1=0 and λ2>0.
Note that λ2 is called the algebraic connectivity of
the graph Pn and is denoted by a(Pn).
It is well-known that
λ2=a(Pn)=4sin²(π/(2n)).
Now we return to the problem of minimizing the value of s2/s1.
We have
s2/s1 = ⟨Lα,α⟩/(n∥α∥²).
The minimum value of
⟨Lα,α⟩/∥α∥²
under conditions ⟨α,e⟩=0, α≠0
is equal to λ2
(and it is attained when α is
an eigenvector of L corresponding to the eigenvalue λ2).
So, let α be an eigenvector of L corresponding to the eigenvalue
λ2, then from (3.10) it follows that
fn(c) ⩾ 1−((n−1)λ2/2)(1−c)−K(1−c)² = 1−an(1−c)−K(1−c)²
for all c∈[cn,1], where cn<1 and K=Kn.
By enlarging K, if necessary, we get the inequality
fn(c) ⩾ 1−an(1−c)−K(1−c)²
for all c∈[0,1].
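The construction used in this proof can be explored numerically. The sketch below takes lines L(αkτ) in the real plane with αk=cos((2k−1)π/(2n)) (an eigenvector of the path Laplacian for the eigenvalue 4sin²(π/(2n))), computes c(τ) from the largest eigenvalue of the sum of projections, assuming d(τ)=∑_{i<j}sin²((αi−αj)τ), computes ∥Pn...P1∥ as the product of ∣cos((αi−αi+1)τ)∣, and returns the ratio (1−∥Pn...P1∥)/(1−c(τ)). Under these assumptions the ratio should approach an=2(n−1)sin²(π/(2n)) as τ→0+. The function names are ours.

```python
import math


def friedrichs_number_of_lines(angles):
    """c for distinct lines in the plane: (largest eigenvalue of P_1 + ... + P_n - 1)/(n - 1)."""
    n = len(angles)
    d = sum(math.sin(angles[i] - angles[j]) ** 2
            for i in range(n) for j in range(i + 1, n))
    largest = (n + math.sqrt(max(n * n - 4.0 * d, 0.0))) / 2.0
    return (largest - 1.0) / (n - 1)


def product_norm_of_lines(angles):
    """||P_n ... P_1|| for lines: product of |cos| of consecutive angle differences."""
    product = 1.0
    for s, t in zip(angles, angles[1:]):
        product *= abs(math.cos(s - t))
    return product


def slope_estimate(n, tau):
    """(1 - ||P_n ... P_1||) / (1 - c(tau)) for alpha_k = cos((2k-1) pi/(2n))."""
    alphas = [math.cos((2 * k - 1) * math.pi / (2 * n)) for k in range(1, n + 1)]
    angles = [a * tau for a in alphas]
    c = friedrichs_number_of_lines(angles)
    return (1.0 - product_norm_of_lines(angles)) / (1.0 - c)
```

For n=4 and τ=10⁻³ the ratio is close to a4=6sin²(π/8)≈0.8787, in agreement with the slope an of the lower bound at c=1.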
Bibliography
[1] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc. 68 (1950) 337–404.
[2] C. Badea, S. Grivaux, V. Müller, A generalization of the Friedrichs angle and the method of alternating projections, C. R. Math. Acad. Sci. Paris 348 (1-2) (2010) 53–56.
[3] C. Badea, S. Grivaux, V. Müller, The rate of convergence in the method of alternating projections, Algebra i Analiz 23 (3) (2011) 1–30.
[4] C. Badea, D. Seifert, Ritt operators and convergence in the method of alternating projections, J. Approx. Theory 205 (2016) 133–148.
[5] F. Deutsch, The method of alternating orthogonal projections. In: S.P. Singh (ed.) Approximation Theory, Spline Functions and Applications, NATO ASI Series (Series C: Mathematical and Physical Sciences), vol. 356, Springer, Dordrecht, 1992, pp. 105–121.
[6] F. Deutsch, The angle between subspaces of a Hilbert space. In: S.P. Singh (ed.) Approximation Theory, Wavelets and Applications, NATO Science Series (Series C: Mathematical and Physical Sciences), vol. 454, Springer, Dordrecht, 1995, pp. 107–130.
[7] I. Halperin, The product of projection operators, Acta Sci. Math. (Szeged) 23 (1962) 96–99.
[8] S. Kayalar, H. Weinert, Error bounds for the method of alternating projections, Math. Control Signals Systems 1 (1988) 43–59.