Orthogonality to matrix subspaces, and a distance formula

Priyanka Grover

arXiv:1705.07288·math.FA·May 23, 2017

TL;DR

This paper gives a necessary and sufficient condition for a matrix to be Birkhoff-James orthogonal to a subspace of matrices, and uses it to derive a formula for the distance from a matrix to any unital C*-subalgebra.

Contribution

It introduces a necessary and sufficient condition for matrix orthogonality to subspaces and a new distance formula to unital C*-subalgebras.

Findings

1. Characterization of Birkhoff-James orthogonality to subspaces
2. Explicit distance formula to unital C*-subalgebras
3. Enhanced understanding of matrix subspace geometry

Abstract

We obtain a necessary and sufficient condition for a matrix $A$ to be Birkhoff-James orthogonal to any subspace $\mathscr{W}$ of $\mathbb{M}_{n}(\mathbb{C})$ . Using this we obtain an expression for the distance of $A$ from any unital $C^{*}$ subalgebra of $\mathbb{M}_{n}(\mathbb{C})$ .

Equations

$$\|A\|=\max_{x\in\mathbb{C}^{n},\ \|x\|=1}\|Ax\|$$

$$\|A+W\|\geq\|A\|\quad\text{for all }W\in\mathscr{W}.$$

$$\operatorname{dist}(A,\mathscr{W})=\min\{\|A-W\|:W\in\mathscr{W}\}.$$

$$\operatorname{dist}(A,\mathbb{C}I)^{2}=\max\{\operatorname{tr}(A^{*}AP)-|\operatorname{tr}(AP)|^{2}:P\geq 0,\ \operatorname{tr}P=1\}.$$

$$\Phi(A^{*}A)-\Phi(A)^{*}\Phi(A)\leq\operatorname{dist}(A,\mathbb{C}I)^{2}.$$

$$\mathcal{C}_{\mathscr{B}}(BX)=B\,\mathcal{C}_{\mathscr{B}}(X)\ \text{ and }\ \mathcal{C}_{\mathscr{B}}(XB)=\mathcal{C}_{\mathscr{B}}(X)\,B\quad\text{for all }B\in\mathscr{B},\ X\in\mathbb{M}_{n}(\mathbb{C}).$$

$$\mathcal{C}(X)=\left[\begin{array}{cccc}X_{11}&&&\\ &X_{22}&&\\ &&\ddots&\\ &&&X_{kk}\end{array}\right].$$

$$\operatorname{dist}(A,\mathscr{B})^{2}=\max\{\operatorname{tr}(A^{*}AP-\mathcal{C}_{\mathscr{B}}(AP)^{*}\,\mathcal{C}_{\mathscr{B}}(AP)\,\mathcal{C}_{\mathscr{B}}(P)^{-1}):P\geq 0,\ \operatorname{tr}P=1\},$$

$$f(y)-f(x)\geq\operatorname{Re}\,v^{*}(y-x)\quad\text{for all }y\in X.$$

$$\partial(g\circ L)(x)=S^{*}\partial g(L(x)),$$

$$\langle S^{*}(y),x\rangle=\langle y,S(x)\rangle\quad\text{for all }x\in X\text{ and }y\in Y.$$

$$\partial\|A\|=\operatorname{conv}\{uv^{*}:\|u\|=\|v\|=1,\ Av=\|A\|u\},$$

$$\|T\|=\sup_{\|X\|_{1}=1}|\operatorname{tr}(TX)|,$$

$$L(W)=A+S(W).$$

$$(g\circ L)(W)\geq(g\circ L)(0),$$

$$0\in S^{*}\partial\|A\|.$$

$$S^{*}\partial\|A\|=\operatorname{conv}\{S^{*}(uv^{*}):\|u\|=\|v\|=1,\ Av=\|A\|u\}.$$

$$S^{*}\Big(\sum t_{i}u_{(i)}v_{(i)}^{*}\Big)=0.$$

$$(s_{j}^{2}-\|A\|^{2})\,p_{j}=0\quad\text{for all }k+1\leq j\leq n.$$

$$\operatorname{dist}(A,\mathscr{B})=\min_{W\in\mathscr{B}}\|A-W\|.$$

$$\operatorname{dist}(A,\mathscr{B})=\operatorname{dist}(\tilde{A},\oplus_{i}\mathbb{M}_{n_{i}}(\mathbb{C})).$$

$$\max\{\operatorname{tr}(A^{*}AP-\mathcal{C}_{\mathscr{B}}(AP)^{*}\,\mathcal{C}_{\mathscr{B}}(AP)\,\mathcal{C}_{\mathscr{B}}(P)^{-1}):P\geq 0,\ \operatorname{tr}P=1\}=\max\{\operatorname{tr}(\tilde{A}^{*}\tilde{A}\tilde{P}-\mathcal{C}(\tilde{A}\tilde{P})^{*}\,\mathcal{C}(\tilde{A}\tilde{P})\,\mathcal{C}(\tilde{P})^{-1}):\tilde{P}\geq 0,\ \operatorname{tr}\tilde{P}=1\},$$

$$\operatorname{tr}(XY)=\operatorname{tr}(YX),$$

$$\operatorname{tr}(A^{*}AP-\mathcal{C}(AP)^{*}\,\mathcal{C}(AP)\,\mathcal{C}(P)^{-1})=\operatorname{tr}(V^{*}A^{*}APV-V^{*}\mathcal{C}(AP)^{*}\,\mathcal{C}(AP)\,\mathcal{C}(P)^{-1}V).$$

$$\operatorname{tr}(\tilde{A}^{*}\tilde{A}\tilde{P}-\mathcal{C}(\tilde{A}\tilde{P})^{*}\,\mathcal{C}(\tilde{A}\tilde{P})\,\mathcal{C}(\tilde{P})^{-1}).$$

Taxonomy

Topics: Matrix Theory and Algorithms · Advanced Topics in Algebra · Holomorphic and Operator Theory

Full text

Orthogonality to matrix subspaces, and a distance formula

Priyanka Grover

*Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, Delhi Centre, 7, S.J.S. Sansanwal Marg, New Delhi-110016, India

Email: [email protected]*

Abstract

We obtain a necessary and sufficient condition for a matrix $A$ to be Birkhoff-James orthogonal to any subspace $\mathscr{W}$ of $\mathbb{M}_{n}(\mathbb{C})$ . Using this we obtain an expression for the distance of $A$ from any unital $C^{*}$ subalgebra of $\mathbb{M}_{n}(\mathbb{C})$ .

*AMS classification:* 15A60, 15A09, 47A12

*Keywords:* Birkhoff-James orthogonality, Subdifferential, Singular value decomposition, Moore-Penrose inverse, Pinching, Variance.

1 Introduction

Let $\mathbb{M}_{n}(\mathbb{C})$ be the space of $n\times n$ complex matrices and let $\mathscr{W}$ be any subspace of $\mathbb{M}_{n}(\mathbb{C})$ . For any $A\in\mathbb{M}_{n}(\mathbb{C})$ , let

$$\|A\|=\max_{x\in\mathbb{C}^{n},\ \|x\|=1}\|Ax\|$$

be the operator norm of $A$ . Then $A$ is said to be (Birkhoff-James) orthogonal to $\mathscr{W}$ if

$$\|A+W\|\geq\|A\|\quad\text{for all }W\in\mathscr{W}.$$

The space $\mathbb{M}_{n}(\mathbb{C})$ is a complex Hilbert space under the inner product $\langle A,B\rangle_{c}={\operatorname{tr\ }}(A^{*}B)$ and a real Hilbert space under the inner product $\langle A,B\rangle_{r}={\operatorname{Re}}\ {\operatorname{tr\ }}(A^{*}B)$ . Let $\mathscr{W}^{\perp}$ be the orthogonal complement of $\mathscr{W}$ , where the orthogonal complement is with respect to the usual Hilbert space orthogonality in $\mathbb{M}_{n}(\mathbb{C})$ with the inner product $\langle\cdot,\cdot\rangle_{c}$ or $\langle\cdot,\cdot\rangle_{r}$ , depending upon whether $\mathscr{W}$ is a complex or real subspace. Note that if $A\in\mathscr{W}^{\perp}$ is such that ${\operatorname{tr\ }}(A^{*}A)=\|A\|^{2}$ , then $A$ is orthogonal to $\mathscr{W}$ .

Bhatia and Šemrl [6] obtained an interesting characterisation of orthogonality when $\mathscr{W}=\mathbb{C}B$ , where $B$ is any matrix in $\mathbb{M}_{n}(\mathbb{C})$ . They showed that $A$ is orthogonal to $\mathbb{C}B$ if and only if there exists a unit vector $x$ such that $\|Ax\|=\|A\|$ and $\langle Ax,Bx\rangle=0$ . In other words, $A$ is orthogonal to $\mathbb{C}B$ if and only if there exists a positive semidefinite matrix $P$ of rank one such that ${\operatorname{tr\ }}P=1,\ {\operatorname{tr\ }}A^{*}AP=\|A\|^{2}$ and $AP\in(\mathbb{C}B)^{\perp}.$ Positive semidefinite matrices with trace 1 are called density matrices. We use the notation $P\geq 0$ to mean $P$ is positive semidefinite.
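The Bhatia-Šemrl criterion can be checked numerically on a small case. The following is a minimal sketch, assuming NumPy; the matrices $A$, $B$, the vector $x$, and the helper `op_norm` are illustrative choices and do not come from the paper. It verifies the two conditions on $x$ and then samples the defining inequality $\|A+\lambda B\|\geq\|A\|$ on a grid of complex scalars $\lambda$.

```python
import numpy as np

# Illustrative matrices (our choice, not from the paper): a nilpotent A and B = I.
A = np.array([[0, 1], [0, 0]], dtype=complex)
B = np.eye(2, dtype=complex)

def op_norm(M):
    """Operator (spectral) norm: the largest singular value."""
    return np.linalg.norm(M, 2)

# Candidate unit vector x: it attains the norm of A and satisfies <Ax, Bx> = 0.
x = np.array([0, 1], dtype=complex)
attains_norm = np.isclose(np.linalg.norm(A @ x), op_norm(A))
inner = np.vdot(A @ x, B @ x)  # np.vdot conjugates its first argument

# Sample the definition ||A + lam * B|| >= ||A|| on a grid of complex scalars.
grid = np.linspace(-2, 2, 41)
min_norm = min(op_norm(A + (s + 1j * t) * B) for s in grid for t in grid)
print(attains_norm, abs(inner), min_norm)
```

A grid search can only sample the inequality, so it supports the characterisation rather than proving it.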

Let $\mathscr{W}=\mathbb{D}_{n}(\mathbb{R}),$ the subspace of all diagonal matrices with real entries, and let $A$ be any Hermitian matrix. Then $A$ is called minimal if $\|A+D\|\geq\|A\|$ for all $D\in\mathbb{D}_{n}(\mathbb{R})$ . Andruchow, Larotonda, Recht, and Varela [1, Theorem 1] showed that a Hermitian matrix $A$ is minimal if and only if there exists a density matrix $P$ such that $PA^{2}=\|A\|^{2}P$ and all diagonal entries of $PA$ are zero. In our notation, saying that $A$ is minimal is the same as saying that $A$ is orthogonal to the subspace $\mathbb{D}_{n}(\mathbb{R})$ . If $A$ is Hermitian, then $A$ is orthogonal to $\mathbb{D}_{n}(\mathbb{R})$ if and only if $A$ is orthogonal to $\mathbb{D}_{n}(\mathbb{C})$ . Now $\mathbb{D}_{n}(\mathbb{C})^{\perp}$ is the subspace of all matrices whose diagonal entries are zero. Taking adjoints, the condition $PA^{2}=\|A\|^{2}P$ is the same as $A^{2}P=\|A\|^{2}P$ , and the diagonal entries of $PA$ vanish exactly when those of $AP$ do. Therefore Theorem 1 in [1] can be interpreted as follows. A Hermitian matrix $A$ is orthogonal to $\mathbb{D}_{n}(\mathbb{C})$ if and only if there exists a density matrix $P$ such that $A^{2}P=\|A\|^{2}P$ and $AP\in\mathbb{D}_{n}(\mathbb{C})^{\perp}$ . The following theorem is a generalization of this result as well as of the Bhatia-Šemrl theorem.

Theorem 1.

Let $A\in\mathbb{M}_{n}(\mathbb{C})$ and let $m(A)$ be the multiplicity of the maximum singular value $\|A\|$ of $A$ . Let $\mathscr{W}$ be any (real or complex) subspace of $\mathbb{M}_{n}(\mathbb{C}).$ Then $A$ is orthogonal to $\mathscr{W}$ if and only if there exists a density matrix $P$ of complex rank at most $m(A)$ such that $A^{*}AP=\|A\|^{2}P$ and $AP\in\mathscr{W}^{\perp}$ . (If rank $P=\ell$ , then $P$ has the form $P=\displaystyle\sum_{i=1}^{\ell}t_{i}v_{(i)}v_{(i)}^{*}$ where $v_{(i)}$ are unit vectors such that $A^{*}Av_{(i)}=\|A\|^{2}v_{(i)}$ and $t_{i}$ are such that $0\leq t_{i}\leq 1$ and $\displaystyle\sum_{i=1}^{\ell}t_{i}=1$ .)
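The conditions in Theorem 1 are easy to test for a given candidate $P$. Below is a minimal sketch, assuming NumPy and assuming $\mathscr{W}$ is a complex subspace given by a spanning list of matrices; the function name `certifies_orthogonality`, its interface, and the test matrices are hypothetical and are not part of the paper. It checks that $P$ is a density matrix, that $A^{*}AP=\|A\|^{2}P$, and that ${\operatorname{tr\ }}(W^{*}AP)=0$ for every spanning element $W$.

```python
import numpy as np

def certifies_orthogonality(A, P, W_basis, tol=1e-10):
    """Check the conditions of Theorem 1 for a candidate density matrix P.

    W_basis is a list of matrices spanning a complex subspace W
    (hypothetical helper; the name and interface are ours).
    """
    norm_A = np.linalg.norm(A, 2)
    is_density = (
        np.allclose(P, P.conj().T, atol=tol)
        and np.linalg.eigvalsh(P).min() >= -tol
        and abs(np.trace(P) - 1) <= tol
    )
    eigen_cond = np.allclose(A.conj().T @ A @ P, norm_A**2 * P, atol=tol)
    # AP lies in W-perp iff tr(W^* A P) = 0 for every spanning element W.
    perp_cond = all(abs(np.trace(W.conj().T @ A @ P)) <= tol for W in W_basis)
    return is_density and eigen_cond and perp_cond

# Illustrative case: W = diagonal 2x2 matrices, A = the off-diagonal flip.
A = np.array([[0, 1], [1, 0]], dtype=complex)
E11 = np.diag([1, 0]).astype(complex)
E22 = np.diag([0, 1]).astype(complex)
P_good = E11                                 # rank one; AP has zero diagonal
P_bad = np.full((2, 2), 0.5, dtype=complex)  # a density matrix, but AP is not in W-perp
print(certifies_orthogonality(A, P_good, [E11, E22]))
print(certifies_orthogonality(A, P_bad, [E11, E22]))
```

A single valid $P$ certifies orthogonality; a failing candidate such as `P_bad` says nothing on its own, since the theorem only asserts that some $P$ exists.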

Here, $m(A)$ is the best possible upper bound on rank $P$ . This has been illustrated later in Remark 4 in Section 4. When $\mathscr{W}=\mathbb{C}B$ , the above theorem says that $A$ is orthogonal to $\mathbb{C}B$ if and only if there exists a $P\geq 0$ of the form $P=\displaystyle{\sum_{i=1}^{\ell}}t_{i}v_{(i)}v_{(i)}^{*}$ such that $\|v_{(i)}\|=1$ , $A^{*}Av_{(i)}=\|A\|^{2}v_{(i)}$ and $\displaystyle\sum_{i=1}^{\ell}t_{i}\langle B^{*}Av_{(i)},v_{(i)}\rangle=0$ . By the Hausdorff-Toeplitz theorem, we get a unit vector $v$ such that $A^{*}Av=\|A\|^{2}v$ and $\langle B^{*}Av,v\rangle=0$ . The first condition is stronger than that in [6, Theorem 1.1].

Let ${\operatorname{dist}}(A,\mathscr{W})$ denote the distance of a matrix $A$ from the subspace $\mathscr{W}$ , defined as

$$\operatorname{dist}(A,\mathscr{W})=\min\{\|A-W\|:W\in\mathscr{W}\}.$$

Audenaert [2] showed that when $\mathscr{W}=\mathbb{C}I$ , then

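$$\operatorname{dist}(A,\mathbb{C}I)^{2}=\max\{\operatorname{tr}(A^{*}AP)-|\operatorname{tr}(AP)|^{2}:P\geq 0,\ \operatorname{tr}P=1\}.\tag{2}$$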

Further the maximisation over $P$ on the right hand side of (2) can be restricted to density matrices of rank 1. The quantity ${\operatorname{tr\ }}(A^{*}AP)-|{\operatorname{tr\ }}(AP)|^{2}$ is called the variance of $A$ with respect to the density matrix $P$ . Bhatia and Sharma [7] showed that if $\Phi:\mathbb{M}_{n}(\mathbb{C})\rightarrow\mathbb{M}_{k}(\mathbb{C})$ is any positive unital linear map, then

$$\Phi(A^{*}A)-\Phi(A)^{*}\Phi(A)\leq\operatorname{dist}(A,\mathbb{C}I)^{2}.$$

By choosing $\Phi(A)={\operatorname{tr\ }}(AP)$ for different density matrices $P$ , they obtained various interesting bounds on ${\operatorname{dist}}(A,\mathbb{C}I)^{2}.$
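For a rank-one density matrix $P=xx^{*}$ the variance in Audenaert's formula becomes $\|Ax\|^{2}-|\langle x,Ax\rangle|^{2}$. The sketch below, assuming NumPy, illustrates how the formula pins down ${\operatorname{dist}}(A,\mathbb{C}I)$ for one illustrative matrix: every variance is a lower bound for the squared distance, and $\|A-\lambda I\|^{2}$ for any fixed $\lambda$ is an upper bound. The matrix $A$, the helper `variance`, and the random sampling are our assumptions, not the paper's.

```python
import numpy as np

def variance(A, x):
    """Variance of A in the pure state x x^*: ||Ax||^2 - |<x, Ax>|^2."""
    x = x / np.linalg.norm(x)
    Ax = A @ x
    return np.vdot(Ax, Ax).real - abs(np.vdot(x, Ax)) ** 2

# Illustrative matrix (our choice): the 2x2 nilpotent Jordan block.
A = np.array([[0, 1], [0, 0]], dtype=complex)

# Upper bound on dist(A, CI)^2 from one admissible scalar (lambda = 0).
upper = np.linalg.norm(A, 2) ** 2

# Lower bounds from rank-one density matrices x x^*.
rng = np.random.default_rng(0)
samples = [rng.normal(size=2) + 1j * rng.normal(size=2) for _ in range(200)]
sampled_max = max(variance(A, x) for x in samples)
at_e2 = variance(A, np.array([0, 1], dtype=complex))
print(sampled_max, at_e2, upper)
```

Here the variance at the second basis vector already meets the upper bound, so both bounds coincide with the squared distance for this $A$.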

It would be interesting to have a generalisation of (2) with $\mathbb{C}I$ replaced by any unital $C^{*}$ subalgebra of $\mathbb{M}_{n}(\mathbb{C})$ . (This problem has also been raised by M. Rieffel in [13].) Let $\mathscr{B}$ be any unital $C^{*}$ subalgebra of $\mathbb{M}_{n}(\mathbb{C})$ . Let $\mathcal{C}_{\mathscr{B}}:\mathbb{M}_{n}(\mathbb{C})\rightarrow\mathscr{B}$ denote the projection of $\mathbb{M}_{n}(\mathbb{C})$ onto $\mathscr{B}$ . We note that $\mathcal{C}_{\mathscr{B}}$ is a bimodule map:

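$$\mathcal{C}_{\mathscr{B}}(BX)=B\,\mathcal{C}_{\mathscr{B}}(X)\ \text{ and }\ \mathcal{C}_{\mathscr{B}}(XB)=\mathcal{C}_{\mathscr{B}}(X)\,B\quad\text{for all }B\in\mathscr{B},\ X\in\mathbb{M}_{n}(\mathbb{C}).\tag{3}$$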

In particular, when $\mathscr{B}$ is the subalgebra of block diagonal matrices, the matrix $\mathcal{C}_{\mathscr{B}}(X)$ is called a pinching of $X$ and is denoted by $\mathcal{C}(X)$ . It is defined as follows. If $X=\left[\begin{array}[]{cccc}X_{11}&\cdots&X_{1k}\\ X_{21}&\cdots&X_{2k}\\ \vdots&\vdots&\vdots\\ X_{k1}&\cdots&X_{kk}\end{array}\right]$ then

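$$\mathcal{C}(X)=\left[\begin{array}{cccc}X_{11}&&&\\ &X_{22}&&\\ &&\ddots&\\ &&&X_{kk}\end{array}\right].\tag{4}$$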

Properties of pinchings are studied in detail in [3] and [4].
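A pinching is straightforward to implement, which makes the bimodule property (3) easy to test. The following is a minimal sketch, assuming NumPy; the function `pinching`, the block sizes, and the random test matrices are illustrative assumptions.

```python
import numpy as np

def pinching(X, block_sizes):
    """Keep the diagonal blocks of X (sizes given by block_sizes), zero the rest."""
    out = np.zeros_like(X)
    start = 0
    for size in block_sizes:
        stop = start + size
        out[start:stop, start:stop] = X[start:stop, start:stop]
        start = stop
    return out

rng = np.random.default_rng(1)
sizes = [2, 1]  # hypothetical block structure on 3x3 matrices
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
# A block diagonal B, obtained by pinching a random matrix.
B = pinching(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)), sizes)

left_ok = np.allclose(pinching(B @ X, sizes), B @ pinching(X, sizes))
right_ok = np.allclose(pinching(X @ B, sizes), pinching(X, sizes) @ B)
trace_ok = np.isclose(np.trace(pinching(X, sizes)), np.trace(X))
print(left_ok, right_ok, trace_ok)
```

The last check records that a pinching preserves the trace, a fact used repeatedly when manipulating the trace expressions in the distance formula.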

Our next result provides a generalisation of (2) for distance of $A$ to any unital $C^{*}$ subalgebra of $\mathbb{M}_{n}(\mathbb{C})$ .

Theorem 2.

Let $\mathscr{B}$ be any unital $C^{*}$ subalgebra of $\mathbb{M}_{n}(\mathbb{C})$ . Let $\mathcal{C}_{\mathscr{B}}:\mathbb{M}_{n}(\mathbb{C})\rightarrow\mathscr{B}$ denote the projection of $\mathbb{M}_{n}(\mathbb{C})$ onto $\mathscr{B}$ .

Then

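$$\operatorname{dist}(A,\mathscr{B})^{2}=\max\{\operatorname{tr}(A^{*}AP-\mathcal{C}_{\mathscr{B}}(AP)^{*}\,\mathcal{C}_{\mathscr{B}}(AP)\,\mathcal{C}_{\mathscr{B}}(P)^{-1}):P\geq 0,\ \operatorname{tr}P=1\},\tag{5}$$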

where $\mathcal{C}_{\mathscr{B}}(P)^{-1}$ denotes the Moore-Penrose inverse of $\mathcal{C}_{\mathscr{B}}(P)$ . The maximum on the right hand side of (5) can be restricted to rank $P\leq m(A)$ .
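The quantity maximised in Theorem 2 can be evaluated directly, including the case where $\mathcal{C}_{\mathscr{B}}(P)$ is singular and the Moore-Penrose inverse is needed. The sketch below assumes NumPy and takes $\mathscr{B}$ to be the diagonal subalgebra of $\mathbb{M}_{2}(\mathbb{C})$; the names `objective` and `diag_part` and the test matrices are hypothetical. Any value of the objective is a lower bound for ${\operatorname{dist}}(A,\mathscr{B})^{2}$, while $\|A-B\|^{2}$ for any $B\in\mathscr{B}$ is an upper bound.

```python
import numpy as np

def objective(A, P, proj):
    """tr(A*AP - C(AP)* C(AP) C(P)^+), with ^+ the Moore-Penrose inverse."""
    CAP = proj(A @ P)
    CP_pinv = np.linalg.pinv(proj(P))
    return np.trace(A.conj().T @ A @ P - CAP.conj().T @ CAP @ CP_pinv).real

def diag_part(X):
    """Projection onto the diagonal subalgebra (a pinching with 1x1 blocks)."""
    return np.diag(np.diag(X))

# Illustrative data (our choice): the off-diagonal flip and a rank-one P.
A = np.array([[0, 1], [1, 0]], dtype=complex)
P = np.diag([1, 0]).astype(complex)  # density matrix; C(P) is singular

value = objective(A, P, diag_part)
upper = np.linalg.norm(A, 2) ** 2    # ||A - B||^2 with B = 0 in the subalgebra
print(value, upper)
```

For this example the two bounds agree, so the squared distance from $A$ to the diagonal matrices equals both.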

We prove Theorem 1 using ideas of subdifferential calculus. A brief summary of these is given in Section 2. The proofs are given in Section 3.

2 Preliminaries

Let $X$ be a complex Hilbert space. Let $f:X\rightarrow\mathbb{R}$ be a convex function. Then the subdifferential of $f$ at any point $x\in X$ , denoted by $\partial f(x)$ , is the set of $v^{*}\in X^{*}$ such that

$$f(y)-f(x)\geq\operatorname{Re}\,v^{*}(y-x)\quad\text{for all }y\in X.\tag{6}$$

It follows from (6) that $f$ is minimized at $x$ if and only if $0\in\partial f(x)$ .

We use an idea similar to the one in [8, Theorem 2.1]. Let $f(W)=\|A+W\|$ . This is the composition of two functions namely $W\rightarrow A+W$ from $\mathscr{W}$ into $\mathbb{M}_{n}(\mathbb{C})$ and $T\rightarrow\|T\|$ from $\mathbb{M}_{n}(\mathbb{C})$ into $\mathbb{R}_{+}$ . Thus we need to find subdifferentials of composition maps. For that we need a chain rule.

Proposition 1.

Let $X,Y$ be any two Hilbert spaces. Let $g:Y\rightarrow\mathbb{R}$ be a convex function. Let $S:X\rightarrow Y$ be a linear map and let $L:X\rightarrow Y$ be the affine map defined by $L(x)=S(x)+y_{0}$ , for some $y_{0}\in Y$ . Then

$$\partial(g\circ L)(x)=S^{*}\partial g(L(x)),$$

where $S^{*}$ is the adjoint of $S$ defined as

$$\langle S^{*}(y),x\rangle=\langle y,S(x)\rangle\quad\text{for all }x\in X\text{ and }y\in Y.$$

In our setting, $g$ is the map $T\rightarrow\|T\|$ . The subdifferential of this map has been calculated by Watson [14].

Proposition 2.

Let $A\in\mathbb{M}_{n}(\mathbb{C})$ . Then

$$\partial\|A\|=\operatorname{conv}\{uv^{*}:\|u\|=\|v\|=1,\ Av=\|A\|u\},$$

where $\mathop{{\rm conv}}D$ denotes the convex hull of a set $D$ .
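Proposition 2 can be sanity-checked numerically: a matrix $uv^{*}$ built from a top singular pair of $A$ should satisfy the subgradient inequality (6) for the operator norm, with the real inner product ${\operatorname{Re}}\ {\operatorname{tr\ }}(X^{*}Y)$. The following is a minimal sketch, assuming NumPy; the random matrices and the helper `real_inner` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Top singular pair: A v = ||A|| u, with s[0] = ||A||.
U, s, Vh = np.linalg.svd(A)
u, v = U[:, 0], Vh[0].conj()
G = np.outer(u, v.conj())  # the subgradient candidate u v^*

def real_inner(X, Y):
    """Real inner product Re tr(X^* Y)."""
    return np.trace(X.conj().T @ Y).real

# Subgradient inequality ||B|| - ||A|| >= <G, B - A>_r, sampled on random B.
gaps = []
for _ in range(500):
    B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    gaps.append(np.linalg.norm(B, 2) - s[0] - real_inner(G, B - A))
print(min(gaps))
```

The smallest gap should be nonnegative up to rounding, because ${\operatorname{Re}}\ {\operatorname{tr\ }}(G^{*}A)=\|A\|$ and ${\operatorname{Re}}\ {\operatorname{tr\ }}(G^{*}B)\leq\|B\|$ for every $B$.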

These elementary facts can be found in [11]. In this book the author deals with convex functions $f:\mathbb{R}^{n}\rightarrow\mathbb{R}$ . The same proofs can be extended to functions $f:X\rightarrow\mathbb{R}$ , where $X$ is any Hilbert space.

3 Proofs

Proof of Theorem 1. Suppose there exists a positive semidefinite $P$ with ${\operatorname{tr\ }}P=1$ such that $A^{*}AP=\|A\|^{2}P$ and $AP\in\mathscr{W}^{\perp}$ . Then for any $W\in\mathscr{W}$

$$\|A+W\|^{2}=\|(A+W)^{*}(A+W)\|=\|A^{*}A+W^{*}A+A^{*}W+W^{*}W\|.$$

Now for any $T\in\mathbb{M}_{n}(\mathbb{C})$ ,

$$\|T\|=\sup_{\|X\|_{1}=1}|\operatorname{tr}(TX)|,$$

where $\|\cdot\|_{1}$ denotes the trace norm. So,

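$$\|A+W\|^{2}\geq|\operatorname{tr}(A^{*}AP+W^{*}AP+A^{*}WP+W^{*}WP)|\geq\operatorname{tr}(A^{*}AP)+\operatorname{Re}\operatorname{tr}(W^{*}AP)+\operatorname{Re}\operatorname{tr}(A^{*}WP)+\operatorname{tr}(W^{*}WP).\tag{10}$$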

Since $AP\in\mathscr{W}^{\perp}$ , we have ${\operatorname{Re}}\ {\operatorname{tr\ }}(A^{*}WP)={\operatorname{Re}}\ {\operatorname{tr\ }}(W^{*}AP)=0$ . The matrices $W^{*}W$ and $P$ are positive semidefinite, therefore ${\operatorname{tr\ }}(W^{*}WP)\geq 0$ and by our assumption, ${\operatorname{tr\ }}(A^{*}AP)=\|A\|^{2}$ . Using these in (10) we get that $\|A+W\|^{2}\geq\|A\|^{2}$ .

Conversely, suppose

$$\|A+W\|\geq\|A\|\quad\text{for all }W\in\mathscr{W}.\tag{11}$$

Let $S:\mathscr{W}\rightarrow\mathbb{M}_{n}(\mathbb{C})$ be the inclusion map. Then $S^{*}:\mathbb{M}_{n}(\mathbb{C})\rightarrow\mathscr{W}$ is the projection onto the subspace $\mathscr{W}$ . Let $L:\mathscr{W}\rightarrow\mathbb{M}_{n}(\mathbb{C})$ be the map defined as

$$L(W)=A+S(W).$$

Let $g:\mathbb{M}_{n}(\mathbb{C})\rightarrow\mathbb{R}$ be the map taking an $n\times n$ matrix $W$ to $\|W\|$ . Then (11) can be rewritten as

$$(g\circ L)(W)\geq(g\circ L)(0),$$

that is, $g\circ L$ is minimized at $0$ . Therefore $0\in\partial(g\circ L)(0)$ . Using Proposition 1, we get

$$0\in S^{*}\partial\|A\|.\tag{12}$$

By Proposition 2,

$$S^{*}\partial\|A\|=\operatorname{conv}\{S^{*}(uv^{*}):\|u\|=\|v\|=1,\ Av=\|A\|u\}.\tag{13}$$

From (12) and (13) it follows that there exist unit vectors $u_{(i)},v_{(i)}$ such that $Av_{(i)}=\|A\|u_{(i)}$ and numbers $t_{i}$ such that $0\leq t_{i}\leq 1$ , $\sum t_{i}=1$ and

$$S^{*}\Big(\sum t_{i}u_{(i)}v_{(i)}^{*}\Big)=0.\tag{14}$$

Let $P=\sum t_{i}v_{(i)}v_{(i)}^{*}$ . Then $P\geq 0$ and ${\operatorname{tr\ }}P=1$ . Note that

$$AP=\sum t_{i}Av_{(i)}v_{(i)}^{*}=\|A\|\sum t_{i}u_{(i)}v_{(i)}^{*}.$$

So, from (14) we get $S^{*}(AP)=0$ , that is, $AP\in\mathscr{W}^{\perp}$ . Since each $v_{(i)}$ is a right singular vector for $A$ , we have $A^{*}Av_{(i)}=\|A\|^{2}v_{(i)}$ . Using this we obtain

$$A^{*}AP=\sum t_{i}A^{*}Av_{(i)}v_{(i)}^{*}=\|A\|^{2}\sum t_{i}v_{(i)}v_{(i)}^{*}=\|A\|^{2}P.\tag{15}$$

Now let $m(A)=k$ . We now show that if $P$ satisfies (15), then rank $P\leq k$ . First note that $A^{*}A$ and $P$ commute and therefore can be diagonalised simultaneously. So we can assume $A^{*}A$ and $P$ in (15) to be diagonal matrices. By hypothesis $k$ of the diagonal entries of $A^{*}A$ are equal to $\|A\|^{2}$ . Let $A^{*}A=\left[\begin{array}[]{ccccccc}\|A\|^{2}&&&&&&\\ &\ddots&&&&&\\ &&\|A\|^{2}&&&\\ &&&s_{k+1}^{2}&&\\ &&&&\ddots&\\ &&&&&s_{n}^{2}\end{array}\right],$ where $s_{j}<\|A\|$ for all $k+1\leq j\leq n$ . If $P=\left[\begin{array}[]{ccc}p_{1}&&\\ &\ddots&\\ &&p_{n}\end{array}\right],$ then from (15) we obtain

$$(s_{j}^{2}-\|A\|^{2})\,p_{j}=0\quad\text{for all }k+1\leq j\leq n.$$

So $p_{j}=0$ for all $k+1\leq j\leq n$ . Hence rank $P\leq k$ . ∎

Proof of Theorem 2. We first show that it is sufficient to prove the result when $\mathscr{B}$ is a subalgebra of block diagonal matrices in $\mathbb{M}_{n}(\mathbb{C})$ . If $\mathscr{B}$ is any unital $C^{*}$ subalgebra of $\mathbb{M}_{n}(\mathbb{C})$ then there exist $n_{1},n_{2},\ldots,n_{k}$ with $\sum_{i}n_{i}=n$ such that $\mathscr{B}$ is $*$ -isomorphic to $\oplus_{i}\mathbb{M}_{n_{i}}(\mathbb{C})$ , the $*$ -isomorphism $\varphi:\mathscr{B}\rightarrow\oplus_{i}\mathbb{M}_{n_{i}}(\mathbb{C})$ being $\varphi(X)=V^{*}XV$ for some unitary matrix $V\in\mathbb{M}_{n}(\mathbb{C})$ (see [9, p. 249], [10, p. 74]). By definition

$$\operatorname{dist}(A,\mathscr{B})=\min_{W\in\mathscr{B}}\|A-W\|.$$

Let $\tilde{A}$ denote the matrix $V^{*}AV$ . Since $\|\cdot\|$ is unitarily invariant, we get

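$$\operatorname{dist}(A,\mathscr{B})=\operatorname{dist}(\tilde{A},\oplus_{i}\mathbb{M}_{n_{i}}(\mathbb{C})).$$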

Next we show that for any density matrix $P$ ,

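$$\max\{\operatorname{tr}(A^{*}AP-\mathcal{C}_{\mathscr{B}}(AP)^{*}\,\mathcal{C}_{\mathscr{B}}(AP)\,\mathcal{C}_{\mathscr{B}}(P)^{-1}):P\geq 0,\ \operatorname{tr}P=1\}=\max\{\operatorname{tr}(\tilde{A}^{*}\tilde{A}\tilde{P}-\mathcal{C}(\tilde{A}\tilde{P})^{*}\,\mathcal{C}(\tilde{A}\tilde{P})\,\mathcal{C}(\tilde{P})^{-1}):\tilde{P}\geq 0,\ \operatorname{tr}\tilde{P}=1\},\tag{17}$$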

where $\mathcal{C}$ is the pinching map as defined in (4). Since

$$\operatorname{tr}(XY)=\operatorname{tr}(YX),\tag{18}$$

we have

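$$\operatorname{tr}(A^{*}AP-\mathcal{C}(AP)^{*}\,\mathcal{C}(AP)\,\mathcal{C}(P)^{-1})=\operatorname{tr}(V^{*}A^{*}APV-V^{*}\mathcal{C}(AP)^{*}\,\mathcal{C}(AP)\,\mathcal{C}(P)^{-1}V).$$

Now note that for any $X\in\mathbb{M}_{n}(\mathbb{C}),$ $V^{*}\mathcal{C}(X)V=\mathcal{C}(V^{*}XV)$ . Therefore the above expression is the same as

$$\operatorname{tr}(\tilde{A}^{*}\tilde{A}\tilde{P}-\mathcal{C}(\tilde{A}\tilde{P})^{*}\,\mathcal{C}(\tilde{A}\tilde{P})\,\mathcal{C}(\tilde{P})^{-1}).$$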

This gives (17). So it is enough to prove (5) when $\mathscr{B}$ is a subalgebra of block diagonal matrices. We first show that

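$$\operatorname{dist}(A,\mathscr{B})^{2}\geq\max\{\operatorname{tr}(A^{*}AP-\mathcal{C}(AP)^{*}\,\mathcal{C}(AP)\,\mathcal{C}(P)^{-1}):P\geq 0,\ \operatorname{tr}P=1\}.\tag{20}$$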

Let $P$ be any density matrix. Then ${\operatorname{tr\ }}(A^{*}AP)\leq\|A\|^{2}.$ Therefore

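$$\operatorname{tr}(A^{*}AP-\mathcal{C}(AP)^{*}\,\mathcal{C}(AP)\,\mathcal{C}(P)^{-1})\leq\|A\|^{2}.\tag{21}$$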

Let $B\in\mathscr{B}$ . Applying the translation $A\rightarrow A+B$ in (21) we get

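$$\operatorname{tr}\big((A+B)^{*}(A+B)P-\mathcal{C}((A+B)P)^{*}\,\mathcal{C}((A+B)P)\,\mathcal{C}(P)^{-1}\big)\leq\|A+B\|^{2}.\tag{22}$$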

We show that the expression on the left hand side is invariant under this translation. By expanding the expression on the left hand side of (22), we get

[TABLE]

We show that except for the first term, $\left({\operatorname{tr\ }}\left(A^{*}AP-\mathcal{C}(AP)^{*}\ \mathcal{C}(AP)\ \mathcal{C}(P)^{-1}\right)\right)$ , the rest of the terms in (23) are zero. We shall prove that the second term

[TABLE]

in (23) is zero. The proof for the other two terms is similar.

By using (3), the expression in (24) is equal to

[TABLE]

By (18) this is equal to

[TABLE]

If $\mathcal{C}(P)$ is invertible then this is clearly zero. So suppose $\mathcal{C}(P)$ is not invertible. This means that if $\mathcal{C}(P)=\left[\begin{array}[]{ccc}P_{1}&&\\ &\ddots&\\ &&P_{k}\end{array}\right]$ , then there exists $i,1\leq i\leq k,$ such that $P_{i}$ is not invertible. Let $U$ denote the block diagonal unitary matrix

$$U=\left[\begin{array}{ccc}U_{1}&&\\ &\ddots&\\ &&U_{k}\end{array}\right],\tag{26}$$

where $U_{i}=I,$ if $P_{i}$ is invertible and $U_{i}^{*}P_{i}U_{i}=\left[\begin{array}[]{ccc}\Lambda_{i}&\\ &O\\ \end{array}\right],$ if $P_{i}$ is not invertible. (Here $\Lambda_{i}$ is the diagonal matrix with eigenvalues of $P_{i}$ as its diagonal entries.) Let $X^{\prime}$ denote the matrix $U^{*}XU$ . Then from (3) and (18), we get that the expression in (25) is same as

[TABLE]

Now $\mathcal{C}(P^{\prime})=\left[\begin{array}[]{ccccc}\Lambda_{1}&&&&\\ &O&&&\\ &&\Lambda_{2}&&\\ &&&O&\\ &&&&\ddots\end{array}\right]$ . Write $A^{\prime}$ and $P^{\prime}$ as $2k$ -block matrices, $A^{\prime}=(A^{\prime}_{rs})_{r,s=1,\ldots,2k}\text{ and }P^{\prime}=(P^{\prime}_{rs})_{r,s=1,\ldots,2k},$ respectively such that whenever $P_{i}$ is not invertible, we have $P_{2i-1,2i-1}^{\prime}=\Lambda_{i}$ and $P_{2i,2i}^{\prime}=O$ .

The $(r,r)$ -entry of $A^{\prime}P^{\prime}$ is $\displaystyle\sum_{s=1}^{2k}A^{\prime}_{rs}P^{\prime}_{sr}$ . Suppose $P^{\prime}_{rr}=O$ . Since $P^{\prime}\geq 0$ , we have $P^{\prime}_{rs}=P^{\prime}_{sr}=O$ for all $s=1,\ldots,2k$ . Hence the $(r,r)$ -entry of $A^{\prime}P^{\prime}$ is zero. So let $P^{\prime}_{rr}\neq O$ . Then the $(r,r)$ -entry of $\left(I-\mathcal{C}(P^{\prime})^{-1}\mathcal{C}(P^{\prime})\right)$ is zero. Therefore the expression in (27) is zero, and hence the expression in (25) is zero. Therefore from (22), we obtain

$$\operatorname{tr}(A^{*}AP-\mathcal{C}(AP)^{*}\,\mathcal{C}(AP)\,\mathcal{C}(P)^{-1})\leq\|A+B\|^{2}$$

for all $B\in\mathscr{B}$ and for all density matrices $P$ . Equation (20) now follows from here.

To show equality in (20), let ${\operatorname{dist}}(A,\mathscr{B})=\|A_{0}\|,$ where $A_{0}=A-B_{0}$ for some $B_{0}\in\mathscr{B}$ . Then $A_{0}$ is orthogonal to $\mathscr{B}$ . By Theorem 1 there exists a density matrix $P$ such that

[TABLE]

and

[TABLE]

From (28) we get that

[TABLE]

By using (3), we obtain

[TABLE]

Substituting (29) in (30) we get

[TABLE]

Now consider ${\operatorname{tr\ }}\left(\mathcal{C}(AP)^{*}\ \mathcal{C}(AP)\ \mathcal{C}(P)^{-1}\right)$ . From (29) we see that this is the same as ${\operatorname{tr\ }}\left(B_{0}^{*}B_{0}\ \mathcal{C}(P)\mathcal{C}(P)^{-1}\mathcal{C}(P)\right)$ . If $\mathcal{C}(P)$ is invertible, then this is equal to ${\operatorname{tr\ }}(B_{0}^{*}B_{0}P)$ . If $\mathcal{C}(P)$ is not invertible, then we define $U$ as done in (26). From (3) and (18), we obtain

[TABLE]

By definition of $U$ , this is equal to ${\operatorname{tr\ }}\left(B_{0}^{\prime*}\ B^{\prime}_{0}\ \mathcal{C}(P^{\prime})\right)$ , which again by (3) and (18), is the same as ${\operatorname{tr\ }}\left(B_{0}^{*}\ B_{0}\ \mathcal{C}(P)\right)$ . Therefore from (31) we have

[TABLE]

4 Remarks

1. It is clear from the proof of Theorem 1 that the condition $A^{*}AP=\|A\|^{2}P$ can be replaced by the weaker condition ${\operatorname{tr\ }}(A^{*}AP)=\|A\|^{2}$ in the statement of Theorem 1.

2. As one would expect, the set $\{A:\|A+W\|\geq\|A\|\text{ for all }W\in\mathscr{W}\}$ need not be a subspace. As an example consider the subspace $\mathscr{W}=\mathbb{C}I$ of $\mathbb{M}_{3}(\mathbb{C})$ . Let $A_{1}=\left[\begin{array}[]{ccc}0&1&0\\ 1&0&1\\ 0&1&0\end{array}\right]$ and $A_{2}=\left[\begin{array}[]{ccc}0&0&1\\ 0&0&0\\ 1&0&0\end{array}\right]$ . It can be checked from Theorem 1 that $A_{1},A_{2}$ are orthogonal to $\mathscr{W}$ . (Take $P=\left[\begin{array}[]{ccc}0&0&0\\ 0&1&0\\ 0&0&0\end{array}\right]$ for $A_{1}$ and $P=\left[\begin{array}[]{ccc}1&0&0\\ 0&0&0\\ 0&0&0\end{array}\right]$ for $A_{2}$ , respectively.) Then $A_{1}+A_{2}=\left[\begin{array}[]{ccc}0&1&1\\ 1&0&1\\ 1&1&0\end{array}\right]$ , and $\|A_{1}+A_{2}\|=2$ . But $\left\|A_{1}+A_{2}-\frac{1}{2}I\right\|=\frac{3}{2}<\|A_{1}+A_{2}\|$ . Hence $A_{1}+A_{2}$ is not orthogonal to $\mathscr{W}$ .
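The counterexample with $A_{1}$ and $A_{2}$ can be reproduced in a few lines. This is a minimal sketch, assuming NumPy; the helper names `op_norm` and `theorem1_conditions` are ours. It checks the conditions of Theorem 1 for $\mathscr{W}=\mathbb{C}I$ with the two density matrices suggested in the text, then compares the two norms.

```python
import numpy as np

A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A2 = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=float)
I = np.eye(3)

def op_norm(M):
    """Operator (spectral) norm: the largest singular value."""
    return np.linalg.norm(M, 2)

def theorem1_conditions(A, P):
    """Theorem 1 with W = CI: A*AP = ||A||^2 P and tr(AP) = 0 (real A here)."""
    return (np.allclose(A.T @ A @ P, op_norm(A) ** 2 * P)
            and np.isclose(np.trace(A @ P), 0.0))

P1 = np.diag([0.0, 1.0, 0.0])  # density matrix used for A1
P2 = np.diag([1.0, 0.0, 0.0])  # density matrix used for A2
S = A1 + A2
print(theorem1_conditions(A1, P1), theorem1_conditions(A2, P2))
print(op_norm(S), op_norm(S - 0.5 * I))
```

Up to rounding, the second line of output shows a norm of $2$ for the sum and $\frac{3}{2}$ after subtracting $\frac{1}{2}I$, which is the failure of orthogonality claimed in the remark.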

3. Let $\mathscr{W}=\{X:{\operatorname{tr\ }}X=0\}$ . Then $\mathscr{W}^{\perp}=\mathbb{C}I$ . In Section 1, we stated that if $A\in\mathscr{W}^{\perp}$ is such that ${\operatorname{tr\ }}(A^{*}A)=\|A\|^{2}$ then $A$ is orthogonal to $\mathscr{W}$ . Therefore all the scalar matrices are orthogonal to $\mathscr{W}$ . (This also follows from Theorem 1 with $P=\frac{1}{n}I$ .) We show that if $A\notin\mathbb{C}I$ then there exists a matrix $W$ with ${\operatorname{tr\ }}W=0$ such that $\|A+W\|<\|A\|$ . Let $\mathcal{D}A$ and $\mathcal{O}A$ denote the diagonal and off-diagonal parts of $A$ , respectively. Then $\mathcal{O}A\in\mathscr{W}$ , $A-\mathcal{O}A=\mathcal{D}A$ and $\|\mathcal{D}A\|\leq\|A\|$ . So it is enough to find $W\in\mathscr{W}$ such that $\|\mathcal{D}A+W\|<\|\mathcal{D}A\|$ . Let $\mathcal{D}A={\operatorname{diag}}\left(a_{1},\ldots,a_{1},a_{2},\ldots,a_{2},\ldots,a_{k},\ldots,a_{k}\right),$ where each $a_{j}$ occurs on the diagonal $n_{j}$ times and $n_{1}+\cdots+n_{k}=n$ . Assume $\|\mathcal{D}A\|=1$ . Take $W={\operatorname{diag}}\left(\frac{a_{2}-a_{1}}{kn_{1}},\ldots,\frac{a_{2}-a_{1}}{kn_{1}},\frac{a_{3}-a_{2}}{kn_{2}},\ldots,\frac{a_{3}-a_{2}}{kn_{2}},\ldots,\frac{a_{k}-a_{k-1}}{kn_{k-1}},\ldots,\frac{a_{k}-a_{k-1}}{kn_{k-1}},\frac{a_{1}-a_{k}}{kn_{k}},\ldots,\frac{a_{1}-a_{k}}{kn_{k}}\right).$ Then $W$ has trace zero and $\mathcal{D}A+W={\operatorname{diag}}\left(\frac{(kn_{1}-1)a_{1}+a_{2}}{kn_{1}},\ldots,\frac{(kn_{1}-1)a_{1}+a_{2}}{kn_{1}},\frac{(kn_{2}-1)a_{2}+a_{3}}{kn_{2}},\ldots,\frac{(kn_{2}-1)a_{2}+a_{3}}{kn_{2}},\ldots,\frac{(kn_{k}-1)a_{k}+a_{1}}{kn_{k}},\ldots,\frac{(kn_{k}-1)a_{k}+a_{1}}{kn_{k}}\right).$ It is easy to check that $\|\mathcal{D}A+W\|<1$ . Hence for this particular $\mathscr{W}$ we have that $\{A:\|A+W\|\geq\|A\|\text{ for all }W\in\mathscr{W}\}=\mathscr{W}^{\perp}=\mathbb{C}I$ .
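The trace-zero perturbation $W$ in this remark can be built mechanically from the distinct diagonal values $a_{j}$ and their multiplicities $n_{j}$: the block for $a_{j}$ is shifted by $(a_{j+1}-a_{j})/(kn_{j})$, with indices taken cyclically. The sketch below assumes NumPy and uses hypothetical sample values for $a_{j}$ and $n_{j}$ with $\|\mathcal{D}A\|=1$; it checks that $W$ has trace zero and that the norm drops below $1$.

```python
import numpy as np

# Hypothetical data: distinct diagonal values a_j with multiplicities n_j.
a = np.array([1.0, -1.0, 0.5])
n = np.array([2, 1, 1])
k = len(a)

DA = np.diag(np.repeat(a, n))            # diag(1, 1, -1, 0.5), operator norm 1
shift = (np.roll(a, -1) - a) / (k * n)   # (a_{j+1} - a_j) / (k n_j), cyclically
W = np.diag(np.repeat(shift, n))

trace_W = np.trace(W)
new_norm = np.linalg.norm(DA + W, 2)
print(trace_W, new_norm)
```

Each diagonal entry of $\mathcal{D}A+W$ is a strict convex combination of two distinct points of the closed unit disc, which is why its modulus falls strictly below $1$.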

4. In Theorem 1, $m(A)$ is the best possible upper bound on rank $P$ . Consider $\mathscr{W}=\{X:{\operatorname{tr\ }}X=0\}$ . From Remark 3, we get that if a matrix $A$ is orthogonal to $\mathscr{W}$ then it has to be of the form $A=\lambda I$ , for some $\lambda\in\mathbb{C}$ . When $A\neq 0$ then $m(A)=n$ . Let $P$ be any density matrix satisfying $AP\in\mathscr{W}^{\perp}$ . Then $AP=\mu I$ , for some $\mu\in\mathbb{C},\mu\neq 0$ . If $P$ also satisfies $A^{*}AP=\|A\|^{2}P$ , then we get $P=\frac{\mu}{\lambda}I$ . Hence rank $P=n=m(A)$ .

5. For $n=2$ and $\mathscr{B}$ any subalgebra of $\mathbb{M}_{2}(\mathbb{C})$ , we can restrict the maximum on the right hand side of (5) to rank one density matrices. By the same argument as in the proof of Theorem 2 it is sufficient to prove this for $\mathbb{D}_{2}(\mathbb{C})$ , the subalgebra of diagonal matrices with complex entries. We show

[TABLE]

where $\Delta$ is the projection onto $\mathbb{D}_{2}(\mathbb{C})$ . From Theorem 2 we have

[TABLE]

Note that

[TABLE]

Let $A=\left[\begin{array}[]{ccc}a&b\\ c&d\end{array}\right]$ and without loss of generality assume that $|b|\geq|c|$ . Then $\|\mathcal{O}A\|=|b|$ . For $x=\left[\begin{array}[]{ccc}0\\ 1\end{array}\right]$

[TABLE]

Combining this with (33), we obtain

[TABLE]

6. For $n=2$ and $\mathscr{B}$ any subalgebra of $\mathbb{M}_{2}(\mathbb{C})$ , we note that

[TABLE]

Again it is enough to show that

[TABLE]

If $A$ is an off-diagonal $2\times 2$ matrix, that is, $A=\left[\begin{array}[]{cc}0&b\\ c&0\end{array}\right]$ , then by Theorem 2.1 in [5] we obtain $\|A+W\|\geq\|A\|\text{ for all }W\in\mathbb{D}_{2}(\mathbb{C})$ . Conversely, let $A\in\mathbb{M}_{2}(\mathbb{C})$ be such that $\|A+W\|\geq\|A\|\text{ for all }W\in\mathbb{D}_{2}(\mathbb{C})$ . Then by taking $W=-\mathcal{D}A$ , we have $A+W=\mathcal{O}(A)$ . Again by using Theorem 2.1 in [5] we obtain that $\|\mathcal{O}(A)\|=\|A\|$ . So $A$ is of the form $\left[\begin{array}[]{cc}a&b\\ c&d\end{array}\right]$ , where $\|A\|=\max\{|b|,|c|\}.$ Since the norm of each row and each column of $A$ is less than or equal to $\|A\|$ , we get that $a=d=0$ . Hence $A\in\mathbb{D}_{2}(\mathbb{C})^{\perp}$ .

Acknowledgement. I would like to thank Professor Rajendra Bhatia for several useful discussions and Professor Ajit Iqbal Singh for helpful comments on this paper.

Bibliography

[1] E. Andruchow, G. Larotonda, L. Recht, A. Varela, A characterization of minimal Hermitian matrices, Linear Algebra Appl. 436 (2012) 2366–2374.
[2] K.M.R. Audenaert, Variance bounds, with an application to norm bounds for commutators, Linear Algebra Appl. 432 (2010) 1126–1143.
[3] R. Bhatia, Matrix Analysis, Springer, 1997.
[4] R. Bhatia, Positive Definite Matrices, Princeton University Press, 2007.
[5] R. Bhatia, M-D. Choi, C. Davis, Comparing a matrix to its off-diagonal part, Oper. Theory Adv. Appl. 40/41 (1989) 151–164.
[6] R. Bhatia, P. Šemrl, Orthogonality of matrices and some distance problems, Linear Algebra Appl. 287 (1999) 77–86.
[7] R. Bhatia, R. Sharma, Some inequalities for positive linear maps, Linear Algebra Appl. 436 (2012) 1562–1571.
[8] T. Bhattacharyya, P. Grover, Characterization of Birkhoff-James orthogonality, J. Math. Anal. Appl. 407 (2013) 350–358.
