The star-center of the quaternionic numerical range

Luís Carvalho; Cristina Diogo; Sérgio Mendes

arXiv:1907.13433·math.FA·August 29, 2019


TL;DR

This paper proves that the quaternionic numerical range is always star-shaped, identifies its star-center, and describes its geometric structure using tangents to the lower bild, advancing understanding of quaternionic matrix analysis.

Contribution

It establishes the star-shaped property of the quaternionic numerical range and characterizes its star-center through geometric analysis, a novel contribution in quaternionic matrix theory.

Findings

01

Quaternionic numerical range is always star-shaped.

02

Star-center determined by equivalence classes of the bild's star-center.

03

Geometric shape of the star-center's upper part is defined by two tangent lines.

Abstract

In this paper we prove that the quaternionic numerical range is always star-shaped and its star-center is given by the equivalence classes of the star-center of the bild. We determine the star-center of the bild, and consequently of the numerical range, by showing that the geometrical shape of the upper part of the center is defined by two lines, tangents to the lower bild.

Equations

$[X]=\bigcup_{x\in X}[x]$

$W(A)=\{\boldsymbol{x}^{*}A\boldsymbol{x}:\boldsymbol{x}\in\mathbb{S}_{\mathbb{H}^{n}}\}$

$W(U^{*}AU)=W(A)$

$B(A)=W(A)\cap\mathbb{C}$

$\{w_{(p)}\}=[w]\cap\mathrm{span}\,\{1,p\}^{+}$

$\{w_{(i)}\}=[w]\cap\mathrm{span}\,\{1,i\}^{+}\subseteq B^{+}$

$[h_{0},h_{1}]=\{(1-\alpha)h_{0}+\alpha h_{1}:\alpha\in[0,1]\}$

$\mathscr{C}(B)=\{b_{0}\in B:[b_{0},b]\subseteq B,\ \text{for any }b\in B\}$

$|c_{v}|=|\alpha a_{v}+(1-\alpha)b_{v}|\leq\alpha|a_{v}|+(1-\alpha)|b_{v}|$

$\omega=\alpha a_{(i)}+(1-\alpha)b_{(i)}=c_{r}+\big(\alpha|a_{v}|+(1-\alpha)|b_{v}|\big)i\in B^{+}$

$-\frac{\omega_{v}}{i}\leq\frac{c_{(i),v}}{i}\leq\frac{\omega_{v}}{i}$

$a_{1}=r+sq_{1}$ and $a_{2}=r+sq_{2}$

$\alpha a_{1}+(1-\alpha)\tilde{a}_{1}\ \dot{\sim}\ \alpha a_{2}+(1-\alpha)\tilde{a}_{2}$

$\mathscr{C}(W\cap\mathbb{C})=\mathscr{C}(W\cap\mathbb{C})^{*}$

$\mathscr{C}(W)\cap\mathbb{C}=\mathscr{C}(W\cap\mathbb{C})$

$y=\Big(\alpha c_{r}+(1-\alpha)w_{r}\Big)+\Big|\alpha c_{v}+(1-\alpha)w_{v}\Big|q$

$a_{r}=b_{r}=y_{r}$ and $|a_{v}|\leq|y_{v}|\leq|b_{v}|$

$b=y_{r}+\Big(\alpha|c_{v}|+(1-\alpha)|w_{v}|\Big)i\in W\cap\mathbb{C}^{+}$

$|b_{v}|=\alpha|c_{v}|+(1-\alpha)|w_{v}|\geq|\alpha c_{v}+(1-\alpha)w_{v}|=|y_{v}|$

$|y_{v}|^{2}=\alpha^{2}|c_{v}|^{2}+(1-\alpha)^{2}|w_{v}|^{2}+\alpha(1-\alpha)\Big(\langle c_{v},w_{v}\rangle+\langle w_{v},c_{v}\rangle\Big)\geq\alpha^{2}|c_{v}|^{2}+(1-\alpha)^{2}|w_{v}|^{2}-2\alpha(1-\alpha)|c_{v}||w_{v}|=\big(\alpha|c_{v}|-(1-\alpha)|w_{v}|\big)^{2}$

$|y_{v}|^{2}\geq\big(\alpha|c_{(i),v}|-(1-\alpha)|w_{(i),v}|\big)^{2}$

$|y_{v}|^{2}\geq\Big|\Big(\alpha c_{(i)}^{*}+(1-\alpha)w_{(i)}\Big)_{v}\Big|^{2}=|a_{v}|^{2}$

$\mathscr{C}\big(W\big)=\Big[\mathscr{C}(W\cap\mathbb{C})\Big]$

$\mathscr{C}(W)\cap\mathrm{span}\,\{1,q\}=\mathscr{C}(W\cap\mathrm{span}\,\{1,q\})=\mathscr{C}(W^{(q)})$

$c\in\mathscr{C}(W)\Leftrightarrow c\in\mathscr{C}(W^{(q)})$, for some $q\in\mathbb{S}_{\mathbb{P}}$

$c\in[\mathscr{C}(W^{(q)})]=[\mathscr{C}(W^{(i)})]=[\mathscr{C}(W\cap\mathbb{C})]$

$\mathscr{C}\Big(\big[W\cap\mathbb{C}\big]\Big)=\Big[\mathscr{C}(W\cap\mathbb{C})\Big]$


Full text

The star-center of the quaternionic numerical range

Luís Carvalho, ISCTE - Lisbon University Institute, Av. das Forças Armadas, 1649-026, Lisbon, Portugal

Cristina Diogo, ISCTE - Lisbon University Institute, Av. das Forças Armadas, 1649-026, Lisbon, Portugal, and Center for Mathematical Analysis, Geometry, and Dynamical Systems, Mathematics Department, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal

Sérgio Mendes, ISCTE - Lisbon University Institute, Av. das Forças Armadas, 1649-026, Lisbon, Portugal, and Centro de Matemática e Aplicações, Universidade da Beira Interior, Rua Marquês d’Ávila e Bolama, 6201-001, Covilhã

Abstract.

In this paper we prove that the quaternionic numerical range is always star-shaped and its star-center is given by the equivalence classes of the star-center of the bild. We determine the star-center of the bild, and consequently of the numerical range, by showing that the geometrical shape of the upper part of the center is defined by two lines, tangents to the lower bild.

Key words and phrases:

quaternions, numerical range, star-shapedness

2010 Mathematics Subject Classification:

15B33, 47A12

The second author was partially supported by FCT through project UID/MAT/04459/2013 and the third author was partially supported by FCT through CMA-UBI, project PEst-OE/MAT/UI0212/2013.

1. Introduction

Let $\mathbb{H}$ denote the skew-field of Hamilton quaternions. Let $A$ be an $n\times n$ matrix with quaternionic entries. It is well known that the numerical range $W_{\mathbb{H}}(A)=W(A)$ is a connected but not necessarily convex subset of the quaternions. The group of unitary quaternions $\mathbb{S}_{\mathbb{H}}$ acts on $\mathbb{H}$ by automorphisms. Since every class $[q]$, $q\in\mathbb{H}$, has a representative in $\mathbb{C}^{+}$ and each class of $q\in W(A)$ is contained in $W(A)$, it became clear from the early studies of the quaternionic numerical range that it is enough to study the bild of $A$, $B(A)=W(A)\cap\mathbb{C}$, or the upper bild $B^{+}(A)=W(A)\cap\mathbb{C}^{+}$. The latter has the advantage of always being convex, whereas $B(A)$ is convex if, and only if, $W(A)$ is convex, see [Zh, page 53] and theorem 3.1. The convexity of the numerical range, the bild and the upper bild has been studied by several authors, see [AY1, AY2, R, ST, STZ].

In the complex setting the numerical range is convex thanks to the celebrated Toeplitz-Hausdorff Theorem [GR]. Over time, several generalizations of the numerical range have been proposed, namely the C-numerical range and the joint numerical range, among others, and in these cases convexity may fail. It then becomes natural to look for convexity-like geometric properties. For instance, the property of star-shapedness has been studied in [CT, LLPS18, LLPS19, LNT, LP]. We recall that star-shapedness of a set $B$ only requires that there is an element $b_{0}\in B$ such that every segment connecting $b_{0}$ and any other element of $B$ is contained in $B$, see definition 2.1. Accordingly, we say that $b_{0}$ is in the star-center of $B$.

For some generalizations of the numerical range, the star-shapedness of the (complex) numerical range holds under certain conditions. In this article we tackle the question of star-shapedness in the quaternionic setting. We prove that the quaternionic numerical range is always star-shaped. In addition, we characterize the shape of the star-center for quaternionic matrices.

The star-shapedness of the numerical range is a consequence of two simple facts (see theorem 3.4). Firstly, the convexity of the upper and lower bilds implies that every segment with a real element of the bild as one endpoint is contained in the bild. Therefore the bild is star-shaped and the reals therein are part of its center. Secondly, all two-dimensional real subalgebras of the quaternions that include the reals (as a real subspace) are equal up to isomorphism, which leads us to the conclusion that the reals in $W(A)$ are in fact part of the (star) center of the numerical range.

As mentioned before, the general reason to focus on the bild is that the whole numerical range can be reconstructed from it by using similarity classes. Our result is in line with the elements of the bild being the building blocks of the numerical range. In fact, we prove in theorem 3.8 that the center of the numerical range is given by the similarity classes of the center of the bild. Therefore, we only need to know the center of the bild and then build the similarity classes to obtain the center of the numerical range. When the matrix is non-hermitian, the upper center (likewise for the lower center) is the region of the upper bild limited by two lines. These two lines are the tangents to the curve defining the boundary of the lower bild at the reals, see theorems 4.1, 4.3 and corollary 4.5. As a consequence of these results we establish a new proof of the important theorem by Au-Yeung [AY1, theorem 3], which establishes a necessary and sufficient condition for convexity of the numerical range, see corollary 4.4. We conclude with an example where we explicitly compute the center.

2. Preliminaries

The quaternionic skew-field $\mathbb{H}$ is an algebra of rank $4$ over $\mathbb{R}$ with basis $\{1,i,j,k\}$ , where the product is given by $i^{2}=j^{2}=k^{2}=ijk=-1$ . For any $q=a_{0}+a_{1}i+a_{2}j+a_{3}k\in\mathbb{H}$ we denote by $q_{r}=a_{0}$ and $q_{v}=a_{1}i+a_{2}j+a_{3}k$ , the real and imaginary parts of $q$ , respectively. Let the pure quaternions be $\mathbb{P}=\mathrm{span}_{\mathbb{R}}\,\{i,j,k\}$ . The conjugate of $q$ is given by $q^{*}=q_{r}-q_{v}$ and the norm is defined by $|q|^{2}=qq^{*}$ . Two quaternions $q,q^{\prime}\in\mathbb{H}$ are called similar, if there exists a unitary quaternion $s$ such that $s^{*}q^{\prime}s=q$ . Similarity is an equivalence relation and we denote by $[q]$ the equivalence class containing $q$ . A necessary and sufficient condition for the similarity of $q$ and $q^{\prime}$ is given by $q_{r}=q^{\prime}_{r}\textrm{ and }|q_{v}|=|q^{\prime}_{v}|$ , see [R, theorem 2.2.6]. We will denote the set of all equivalence classes of the elements of a set $X\subseteq\mathbb{H}$ by $[X]$ . Then,

$$[X]=\bigcup_{x\in X}[x].$$
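The similarity criterion above can be checked numerically. The following is a minimal sketch, not part of the paper: quaternions are stored as `(real, i, j, k)` tuples, and the helper names `qmul`, `qconj`, `vec_norm`, `are_similar` and `upper_complex_representative` are assumptions introduced here for illustration.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(q):
    """Conjugate q^* = q_r - q_v."""
    return (q[0], -q[1], -q[2], -q[3])

def vec_norm(q):
    """Modulus |q_v| of the imaginary (vector) part."""
    return math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)

def are_similar(p, q, tol=1e-9):
    """Criterion from the text: equal real parts and equal |q_v|."""
    return abs(p[0] - q[0]) <= tol and abs(vec_norm(p) - vec_norm(q)) <= tol

def upper_complex_representative(q):
    """Representative q_r + i|q_v| of the class [q] in C^+."""
    return (q[0], vec_norm(q), 0.0, 0.0)

q = (1.0, 2.0, 3.0, 4.0)
s = (0.5, 0.5, 0.5, 0.5)                  # a unit quaternion
conjugated = qmul(qmul(qconj(s), q), s)   # s^* q s
```

Here `conjugated` lies in the class $[q]$ by construction, so `are_similar(q, conjugated)` should hold, and `upper_complex_representative(q)` gives the element of $[q]$ in $\mathbb{C}^{+}$.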

Let $\mathbb{H}^{n}$ be the $n$ -dimensional $\mathbb{H}$ -space. The norm of $\boldsymbol{x}\in\mathbb{H}^{n}$ is $|\boldsymbol{x}|^{2}=\boldsymbol{x}^{*}\boldsymbol{x}$ . The disk with center $\boldsymbol{a}\in\mathbb{H}^{n}$ and radius $r>0$ is the set $\mathbb{D}_{\mathbb{H}^{n}}(\boldsymbol{a},r)=\{\boldsymbol{x}\in\mathbb{H}^{n}:|\boldsymbol{x}-\boldsymbol{a}|\leq r\}$ and its boundary is the sphere $\mathbb{S}_{\mathbb{H}^{n}}(\boldsymbol{a},r)$ . In particular, if $\boldsymbol{a}=\boldsymbol{0}$ and $r=1$ , we simply write $\mathbb{D}_{\mathbb{H}^{n}}$ and $\mathbb{S}_{\mathbb{H}^{n}}$ . With this notation, the group of unitary quaternions is $\mathbb{S}_{\mathbb{H}}$ whereas $\mathbb{S}_{\mathbb{P}}$ denotes the unit sphere over the pure quaternions.

Let $\mathscr{M}_{n}(\mathbb{H})$ be the set of all $n\times n$ matrices with entries over $\mathbb{H}$ . The set

$$W(A)=\{\boldsymbol{x}^{*}A\boldsymbol{x}:\boldsymbol{x}\in\mathbb{S}_{\mathbb{H}^{n}}\}$$

is called the quaternionic numerical range of $A$ in $\mathbb{H}$ . From the above definition we see that the quaternionic numerical range of $A\in\mathscr{M}_{n}(\mathbb{H})$ is the subset of $\mathbb{H}$ containing the images of the quadratic function $f_{A}(\boldsymbol{x})=\boldsymbol{x}^{*}A\boldsymbol{x}$ over the quaternionic unitary sphere, $\boldsymbol{x}\in\mathbb{S}_{\mathbb{H}^{n}}$ . The numerical range is invariant under unitary equivalence, i.e.

$$W(U^{*}AU)=W(A),$$

for every unitary $U\in\mathscr{M}_{n}(\mathbb{H})$ [R, theorem 3.5.4].

It is well known that if $q\in W(A)$ then $[q]\subseteq W(A)$ , see [R, page 38]. This means that if $q_{1}\sim q_{2}$ and $q_{2}\in W(A)$ then $q_{1}\in W(A)$ . For simplicity we just say that $q_{2}$ belongs to $W(A)$ by similarity. Therefore, it is enough to study the subset of complex elements in each similarity class. This set is known as $B(A)$ , the bild of $A$ :

$$B(A)=W(A)\cap\mathbb{C}.$$
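As an illustration of the quadratic function $f_{A}$ and of the closedness of $W(A)$ under similarity, here is a small numerical sketch. It is not taken from the paper: the matrix `A = diag(i, j)`, the vector `x` and all helper names are hypothetical choices made for this example.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def vec_norm(q):
    return math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)

def quadratic_form(A, x):
    """f_A(x) = x^* A x = sum over r, c of conj(x_r) * A[r][c] * x_c."""
    total = [0.0, 0.0, 0.0, 0.0]
    for r, row in enumerate(A):
        for c, entry in enumerate(row):
            term = qmul(qmul(qconj(x[r]), entry), x[c])
            total = [t + u for t, u in zip(total, term)]
    return tuple(total)

ZERO = (0.0, 0.0, 0.0, 0.0)
I = (0.0, 1.0, 0.0, 0.0)
J = (0.0, 0.0, 1.0, 0.0)
A = [[I, ZERO], [ZERO, J]]                      # hypothetical example: diag(i, j)

h = 1.0 / math.sqrt(2.0)
x = [(h, 0.0, 0.0, 0.0), (h, 0.0, 0.0, 0.0)]    # a unit vector in H^2
s = (0.5, 0.5, 0.5, 0.5)                        # a unit quaternion

value = quadratic_form(A, x)                    # equals (i + j) / 2
# Replacing x by x s gives s^* f_A(x) s, another element of the same class.
value_rotated = quadratic_form(A, [qmul(xk, s) for xk in x])
upper_bild_point = (value[0], vec_norm(value))  # (real part, |vector part|)
```

Both `value` and `value_rotated` belong to $W(A)$ and share the same real part and the same modulus of the vector part, so they determine the same point `upper_bild_point` of the upper bild.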

We will freely use both notations $B(A)$ and $W(A)\cap\mathbb{C}$ for the bild of $A$ . Although the bild may not be convex, the upper bild $B^{+}=W(A)\cap\mathbb{C}^{+}$ is always convex, see [ST]. Analogously, the lower bild $B^{-}=W(A)\cap\mathbb{C}^{-}$ is also always convex. Note that $\mathbb{C}^{+}\cap\mathbb{C}^{-}=\mathbb{R}$ , $B=B^{+}\cup B^{-}$ and $B^{+}\cap B^{-}\subseteq\mathbb{R}$ .

For $p\in\mathbb{P}$ , let $\text{Span}\{1,p\}^{+}=\{\alpha+\beta p:\alpha\in\mathbb{R},\beta\in\mathbb{R}_{0}^{+}\}$ . For any $w\in W(A)$ and $p\in\mathbb{P}$ , let $w_{(p)}$ be the representative of the class $[w]$ in $\mathrm{span}\,\{1,p\}^{+}$ , that is,

$$\{w_{(p)}\}=[w]\cap\mathrm{span}\,\{1,p\}^{+}.$$

In particular,

$$\{w_{(i)}\}=[w]\cap\mathrm{span}\,\{1,i\}^{+}\subseteq B^{+}$$

and we can write $w_{(i)}=w_{r}+i|w_{v}|$ .

Let $V\subseteq\mathbb{H}\cong\mathbb{R}^{4}$ be a real subspace of $\mathbb{H}$ . We denote by $\pi_{V}$ the canonical $\mathbb{R}$ -linear projection $\pi_{V}:\mathbb{H}\to V$ .

For $h_{0},h_{1}\in\mathbb{H}$ we will denote by $[h_{0},h_{1}]$ the set of convex linear combinations of $h_{0}$ and $h_{1}$ :

$$[h_{0},h_{1}]=\{(1-\alpha)h_{0}+\alpha h_{1}:\alpha\in[0,1]\}.$$

Definition 2.1.

Let $B$ be a subset of a vector space. We say the set $B$ is star-shaped if there is a vector $b_{0}\in B$ such that $[b_{0},b]\subseteq B\,\,,\,\forall b\in B$ . The star-center of a set $B$ is defined to be

$$\mathscr{C}(B)=\{b_{0}\in B:[b_{0},b]\subseteq B,\ \text{for any }b\in B\}.$$

For simplicity, we refer to the star-center of a set as the center.
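To make definition 2.1 concrete, the sketch below tests membership in the center on a toy planar set. Everything in it is an assumption made for illustration and does not come from the paper: the plus-shaped set, the grid of sample points and the sampling resolution. Because segments are only sampled, it approximates the definition rather than deciding it exactly.

```python
def in_plus_shape(p):
    """Toy non-convex set: union of the rectangles [-2,2]x[-1,1] and [-1,1]x[-2,2]."""
    x, y = p
    return (abs(x) <= 2 and abs(y) <= 1) or (abs(x) <= 1 and abs(y) <= 2)

def segment_inside(p, q, contains, steps=200):
    """Check sampled points of the segment [p, q] against the membership test."""
    for k in range(steps + 1):
        t = k / steps
        point = ((1 - t) * p[0] + t * q[0], (1 - t) * p[1] + t * q[1])
        if not contains(point):
            return False
    return True

def in_star_center(candidate, sample_points, contains):
    """Sampled version of: [candidate, b] is contained in B for every b in B."""
    return contains(candidate) and all(
        segment_inside(candidate, b, contains) for b in sample_points
    )

# Sample points of the toy set on a half-integer grid.
grid = [(i / 2, j / 2) for i in range(-4, 5) for j in range(-4, 5)]
sample = [p for p in grid if in_plus_shape(p)]

origin_is_central = in_star_center((0.0, 0.0), sample, in_plus_shape)
corner_is_central = in_star_center((2.0, 1.0), sample, in_plus_shape)
```

The origin sees every sampled point, while the segment from $(2,1)$ to $(1,2)$ passes through $(1.5,1.5)$, which lies outside the set; hence the toy set is star-shaped but not convex.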

3. Star-shapedness of the bild and numerical range

The upper bild and the bild fully specify the numerical range, but the former is considered better suited to represent the quaternionic numerical range. This is not only because it is convex but also because it has the advantage of containing one single element from each similarity class. In a sense, the upper bild can be interpreted as the set of equivalence classes for the similarity relation $\sim$, that is, the quotient set $B^{+}=W/\sim$. However, from the convexity of the upper bild we cannot infer the convexity of the numerical range, since the former is always convex and the latter need not be.

The first result of this paper relates the convexity of the bild with the convexity of the numerical range. This is a known result (see [Zh, page 53]), however we present a different proof based on elementary properties of the numerical range.

Theorem 3.1.

Let $A\in\mathscr{M}_{n}(\mathbb{H})$ . Then $W(A)\cap\mathbb{C}$ is convex if and only if $W(A)$ is convex.

Proof.

It is enough to prove that, if $W(A)\cap\mathbb{C}$ is convex then $W(A)$ is convex. Let $a,b\in W(A)$ and $\alpha\in[0,1]$ . We need to show that $c=\alpha a+(1-\alpha)b\in W(A)$ . The quaternion $c=c_{r}+c_{v}$ has $c_{r}=\alpha a_{r}+(1-\alpha)b_{r}$ and

$$|c_{v}|=|\alpha a_{v}+(1-\alpha)b_{v}|\leq\alpha|a_{v}|+(1-\alpha)|b_{v}|.\tag{3.1}$$

We will prove that $c_{(i)}\in B^{+}$ , thus proving by similarity that $c\in W(A)$ . Since the upper bild is convex,

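$$\omega=\alpha a_{(i)}+(1-\alpha)b_{(i)}=c_{r}+\big(\alpha|a_{v}|+(1-\alpha)|b_{v}|\big)i\in B^{+}.\tag{3.2}$$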

By similarity, $\omega^{*}\in B^{-}$ . Note that $c_{(i)}=c_{(i),r}+c_{(i),v}=c_{r}+i|c_{v}|$ . From (3.2), $c_{(i),r}=\omega_{r}=\omega^{*}_{r}$ and from (3.1), $|c_{(i),v}|\leq|\omega_{v}|$ . Therefore,

$$-\frac{\omega_{v}}{i}\leq\frac{c_{(i),v}}{i}\leq\frac{\omega_{v}}{i},$$

and so there is $\beta\in[0,1]$ such that $c_{(i),v}=\beta\omega_{v}+(1-\beta)\omega_{v}^{*}$ . Hence, $c_{(i)}=\beta\omega+(1-\beta)\omega^{*}$ . By hypothesis, $W(A)\cap\mathbb{C}$ is convex and so $c_{(i)}\in W(A)\cap\mathbb{C}$ . ∎
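The weight $\beta$ used at the end of this proof can be made explicit. The following is a short worked computation, assuming $\omega_{v}\neq 0$ (if $\omega_{v}=0$ then $c_{(i),v}=0$ as well and any $\beta\in[0,1]$ works). It uses only that $\omega\in B^{+}$, so that $\omega_{v}=|\omega_{v}|i$ and $\omega_{v}^{*}=-\omega_{v}$, and that $c_{(i),v}=|c_{v}|i$.

```latex
\begin{aligned}
c_{(i),v} &= \beta\omega_{v}+(1-\beta)\omega_{v}^{*} = (2\beta-1)\,\omega_{v}
\quad\Longleftrightarrow\quad |c_{v}| = (2\beta-1)\,|\omega_{v}|,\\
\beta &= \frac{1}{2}\Big(1+\frac{|c_{v}|}{|\omega_{v}|}\Big)\in\Big[\tfrac{1}{2},1\Big],
\qquad\text{since } 0\leq|c_{v}|\leq|\omega_{v}|.
\end{aligned}
```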

Any quaternionic matrix $A\in\mathscr{M}_{n}(\mathbb{H})$ can be written as $A=\tilde{H}+\tilde{S}$, with $\tilde{H}=\frac{A+A^{*}}{2}$ hermitian and $\tilde{S}=\frac{A-A^{*}}{2}$ skew-hermitian. Let $U\in\mathscr{M}_{n}(\mathbb{H})$ be a unitary matrix that diagonalizes $\tilde{S}$, i.e., $S=U^{*}\tilde{S}U=\mathrm{diag}\,(s_{1},\ldots,s_{n})$. Since the numerical range is invariant under unitary equivalence, we can work with $U^{*}AU$, which can be written in the form $U^{*}AU=U^{*}\tilde{H}U+U^{*}\tilde{S}U=H+S$. Since $H$ is hermitian, $f_{H}(\boldsymbol{x})\in\mathbb{R}$, and since $S$ is skew-hermitian, the real part of $f_{S}(\boldsymbol{x})$ is zero, see [R, corollary 3.5.3].

We claim that $0\in W_{\mathbb{H}}(S)$. To prove this we will find a vector $\boldsymbol{x}\in\mathbb{S}_{\mathbb{H}^{n}}$ such that $f_{S}(\boldsymbol{x})=0$. Let $x_{3}=\ldots=x_{n}=0$, then take $z_{1}$ and $z_{2}$ in $\mathbb{S}_{\mathbb{H}}$ such that $q_{1}=z_{1}^{*}s_{1}z_{1}\in\mathbb{C}^{+}$ and $q_{2}=z_{2}^{*}s_{2}z_{2}\in\mathbb{C}^{-}$. The quaternions $q_{1}$ and $q_{2}$ are either zero or the representatives of $s_{1}$ in $\mathbb{C}^{+}$ and $s_{2}$ in $\mathbb{C}^{-}$, respectively. Thus they are purely imaginary complex numbers. Finally, choose $\beta\in[0,1]$ such that $\beta q_{1}+(1-\beta)q_{2}=0$. Take $x_{1}=\beta^{1/2}z_{1}$ and $x_{2}=(1-\beta)^{1/2}z_{2}$. Then the vector $\boldsymbol{x}\in\mathbb{S}_{\mathbb{H}^{n}}$ satisfies the stated conditions. It is now clear that $W(A)\cap\mathbb{R}\neq\emptyset$. In fact, take this vector $\boldsymbol{x}$ and compute $f_{A}(\boldsymbol{x})=f_{H}(\boldsymbol{x})+f_{S}(\boldsymbol{x})=f_{H}(\boldsymbol{x})\in\mathbb{R}$. We have proved the following result (see footnote 1).

Footnote 1. This result has apparently been known for some time, as it appears in the thesis [Siu], supervised by Au-Yeung; however, to the best of our knowledge, it has never been published before. In spite of this, Au-Yeung in [AY1, corollary 1], apropos of the connectedness of $W_{\mathbb{H}}\cap\mathbb{R}$ and citing a result from [J], states the possibility of $W_{\mathbb{H}}\cap\mathbb{R}=\emptyset$. This possibility is also stated in [Zh, theorem 9.2] and [K, corollary 2.10], repeating again the same result from [J] (although [K] does not cite it).

Proposition 3.2.

For any $A\in\mathscr{M}_{n}(\mathbb{H})$ , $W(A)\cap\mathbb{R}\neq\emptyset$ .
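The choice of $\beta$ in the argument preceding the proposition can be written out explicitly. This is a sketch under the assumptions $n\geq 2$ and $|s_{1}|+|s_{2}|>0$ (if $s_{1}=s_{2}=0$, any $\beta\in[0,1]$ works); it uses that the diagonal entries of the skew-hermitian matrix $S$ are pure quaternions, so that $q_{1}=|s_{1}|i$ and $q_{2}=-|s_{2}|i$.

```latex
\begin{aligned}
\beta q_{1}+(1-\beta)q_{2}=0
  &\iff \beta|s_{1}|=(1-\beta)|s_{2}|
  \iff \beta=\frac{|s_{2}|}{|s_{1}|+|s_{2}|},\\
|\boldsymbol{x}|^{2} &= |x_{1}|^{2}+|x_{2}|^{2}=\beta|z_{1}|^{2}+(1-\beta)|z_{2}|^{2}=1,\\
f_{S}(\boldsymbol{x}) &= x_{1}^{*}s_{1}x_{1}+x_{2}^{*}s_{2}x_{2}
  =\beta z_{1}^{*}s_{1}z_{1}+(1-\beta)z_{2}^{*}s_{2}z_{2}
  =\beta q_{1}+(1-\beta)q_{2}=0.
\end{aligned}
```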

From now on, we fix a matrix with quaternionic entries, $A\in\mathscr{M}_{n}(\mathbb{H})$ , and we denote the quaternionic numerical range of $A$ simply by $W=W(A)$ .

Let $q_{1},q_{2}\in\mathbb{S}_{\mathbb{P}}$ . We say an element $a_{1}\in\mathrm{span}\{1,q_{1}\}$ is $\dot{\sim}$ -similar to $a_{2}\in\mathrm{span}\{1,q_{2}\}$ , if and only if, for some $r,s\in\mathbb{R}$ ,

$$a_{1}=r+sq_{1}\quad\text{and}\quad a_{2}=r+sq_{2},$$

in which case we write $a_{1}\dot{\sim}a_{2}.$ We say that $A_{1}\subseteq\mathrm{span}\{1,q_{1}\}$ and $A_{2}\subseteq\mathrm{span}\{1,q_{2}\}$ are $\dot{\sim}$ -similar, and denote it by $A_{1}\dot{\sim}A_{2}$ , if and only if, for any $a_{1}\in A_{1}$ there is an $a_{2}\in A_{2}$ such that $a_{1}\dot{\sim}a_{2}$ , and vice versa. When two sets are $\dot{\sim}$ -similar they share some properties, namely convexity. In fact, if $A_{1}$ is convex we can conclude that $A_{2}$ is convex. Take any $a_{2},\tilde{a}_{2}\in A_{2}$ . Then, there are $a_{1},\tilde{a}_{1}\in A_{1}$ , such that $a_{1}\dot{\sim}a_{2}$ and $\tilde{a}_{1}\dot{\sim}\tilde{a}_{2}$ . For any $\alpha\in[0,1]$ it is a matter of simple calculations to note that

$$\alpha a_{1}+(1-\alpha)\tilde{a}_{1}\ \dot{\sim}\ \alpha a_{2}+(1-\alpha)\tilde{a}_{2}.$$

Now, since $A_{1}\dot{\sim}A_{2}$ and $A_{1}$ is convex we conclude that $\alpha a_{2}+(1-\alpha)\tilde{a}_{2}\in A_{2}$ . Therefore $A_{2}$ is also convex. A similar argument proves that the centers are $\dot{\sim}$ -similar for any two $\dot{\sim}$ -similar sets $A_{1}$ and $A_{2}$ , since whenever a segment is in $A_{1}$ the $\dot{\sim}$ -similar segment must be in $A_{2}$ . That is, $\mathscr{C}(A_{1})\dot{\sim}\mathscr{C}(A_{2})$ whenever $A_{1}\dot{\sim}A_{2}$ .
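The "simple calculations" behind the convexity argument above can be spelled out as follows. The real coefficients $\tilde{r},\tilde{s}$ of the second pair are names introduced here for illustration only.

```latex
\begin{aligned}
&a_{1}=r+sq_{1},\quad a_{2}=r+sq_{2},\quad
 \tilde{a}_{1}=\tilde{r}+\tilde{s}q_{1},\quad \tilde{a}_{2}=\tilde{r}+\tilde{s}q_{2},\\
&\alpha a_{1}+(1-\alpha)\tilde{a}_{1}
 =\big(\alpha r+(1-\alpha)\tilde{r}\big)+\big(\alpha s+(1-\alpha)\tilde{s}\big)q_{1},\\
&\alpha a_{2}+(1-\alpha)\tilde{a}_{2}
 =\big(\alpha r+(1-\alpha)\tilde{r}\big)+\big(\alpha s+(1-\alpha)\tilde{s}\big)q_{2}.
\end{aligned}
```

Both combinations have the same real coefficients with respect to $\{1,q_{1}\}$ and $\{1,q_{2}\}$, which is exactly the definition of $\dot{\sim}$-similarity.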

Define, for $q\in\mathbb{S}_{\mathbb{P}}$ , $W^{(q)}=W\cap\mathrm{span}\{1,q\},$ $W^{(q)+}=W\cap\mathrm{span}\{1,q\}^{+}$ and $W^{(q)-}=W\cap\mathrm{span}\{1,q\}^{-}.$

Lemma 3.3.

For any $q_{1},q_{2}\in\mathbb{S}_{\mathbb{P}}$ we have:

(i) $W^{(q_{1})+}$ and $W^{(q_{1})-}$ are convex;

(ii) $\mathscr{C}\Big(W^{(q_{1})}\Big)\dot{\sim}\mathscr{C}\Big(W^{(q_{2})}\Big)$.

Proof.

The numerical range is such that, by similarity, $W^{(q_{1})}\dot{\sim}W^{(q_{2})}$, for any $q_{1},q_{2}\in\mathbb{S}_{\mathbb{P}}$. It is also an immediate consequence of the closedness of the numerical range under similarity that $W^{(q_{1})+}\dot{\sim}W^{(q_{2})+}$, for any $q_{1},q_{2}\in\mathbb{S}_{\mathbb{P}}$. It is known that the upper bild $W^{(i)+}=B^{+}$ is convex, thus from the previous discussion, we have that $W^{(q)+}$ is also convex for any $q\in\mathbb{S}_{\mathbb{P}}$. Moreover, from $W^{(q_{1})}\dot{\sim}W^{(q_{2})}$ we know that $\mathscr{C}\Big(W^{(q_{1})}\Big)\dot{\sim}\mathscr{C}\Big(W^{(q_{2})}\Big)$. ∎

As a consequence of this lemma we only need to study the center of one of the $W^{(q)}$ ’s and the natural choice is to take $q=i$ , that is, we only need to study the center of the bild $B=W^{(i)}$ .

Theorem 3.4.

The quaternionic numerical range $W$ is star-shaped and $W\cap\mathbb{R}\subseteq\mathscr{C}(W)$ .

Proof.

By proposition 3.2, there is $r\in W(A)\cap\mathbb{R}$ . For every $\omega\in W$ , there is $q\in\mathbb{S}_{\mathbb{P}}$ such that $\omega\in W^{(q)}$ . Since $W^{(q)}=W^{(q)+}\cup W^{(q)-}$ and using lemma 3.3 we have that $[r,\omega]\subseteq W^{(q)}\subseteq W$ . Hence, the numerical range is star-shaped. Moreover, $W\cap\mathbb{R}\subseteq\mathscr{C}(W)$ . ∎

The numerical range $W(A)$ is contained in $\mathbb{R}$ if, and only if, $A$ is hermitian, see [R, corollary 3.5.3]. The next result follows trivially from theorem 3.4.

Corollary 3.5.

If $A$ is hermitian then $\mathscr{C}(W(A))=W(A)$ .

Lemma 3.6.

The center of the bild is closed under conjugation, i.e.

$$\mathscr{C}(W\cap\mathbb{C})=\mathscr{C}(W\cap\mathbb{C})^{*}.$$

Proof.

Assume $c\in\mathscr{C}(W\cap\mathbb{C})$. Let $\omega$ be any element of the bild, $\omega\in W\cap\mathbb{C}$. Since the bild is closed under conjugation, $\omega^{*}\in W\cap\mathbb{C}$. Then $c$ being in the center implies that $\alpha c+(1-\alpha)\omega^{*}\in W\cap\mathbb{C}$, for any $\alpha\in[0,1]$. Using again the closedness of the bild under conjugation, we conclude that $\alpha c^{*}+(1-\alpha)\omega\in W\cap\mathbb{C}$. Since this is true for any $\omega\in W\cap\mathbb{C}$, $c^{*}\in\mathscr{C}(W\cap\mathbb{C})$. The converse inclusion follows similar steps. ∎

We now establish the equality between the center of the bild and the complex part of the center of the numerical range.

Proposition 3.7.

We have:

$$\mathscr{C}(W)\cap\mathbb{C}=\mathscr{C}(W\cap\mathbb{C}).$$

Proof.

The inclusion $\mathscr{C}(W)\cap\mathbb{C}\subseteq\mathscr{C}(W\cap\mathbb{C})$ is obvious since a complex element in the center of $W$ must be in the center of $W\cap\mathbb{C}$ .

For the converse inclusion, starting with $c\in\mathscr{C}(W\cap\mathbb{C})$, we will prove that $y=\alpha c+(1-\alpha)w\in W$ for any $\alpha\in[0,1]$ and $w\in W$.

We can assume, without loss of generality, that $c=c_{(i)}\in\mathbb{C}^{+}$ . Since any quaternion $y$ can be written as the sum of a real with a pure quaternion, we may write $y=y_{r}+|y_{v}|q$ , with $q\in\mathbb{S}_{\mathbb{P}}$ . We have:

$$y=\Big(\alpha c_{r}+(1-\alpha)w_{r}\Big)+\Big|\alpha c_{v}+(1-\alpha)w_{v}\Big|q.$$

By similarity, it is enough to prove that $y_{(i)}\in B^{+}$ . With this purpose, we will find two elements $a,b\in B^{+}$ such that

$$a_{r}=b_{r}=y_{r}\quad\text{and}\quad|a_{v}|\leq|y_{v}|\leq|b_{v}|.\tag{3.3}$$

In this case, by convexity of the upper bild, $y_{(i)}\in B^{+}$ since $y_{(i)}=\beta a+(1-\beta)b,$ for some $\beta\in[0,1]$ . Let

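$$b=\alpha c_{(i)}+(1-\alpha)w_{(i)}=y_{r}+\Big(\alpha|c_{v}|+(1-\alpha)|w_{v}|\Big)i\in W\cap\mathbb{C}^{+}.$$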

The conclusion that $b\in B^{+}$ follows from the fact that $w_{(i)},c_{(i)}\in B^{+}$ , which is a convex set.

If $\alpha|c_{v}|-(1-\alpha)|w_{v}|>0$ we take $a=\alpha c_{(i)}+(1-\alpha)w_{(i)}^{*}$ , else we take $a=\alpha c_{(i)}^{*}+(1-\alpha)w_{(i)}$ (clearly, $a\in\mathbb{C}^{+}$ ).

We now need to check that $a$ and $b$ are in $W$ and satisfy conditions (3.3). It is trivial to conclude that the real parts are all equal. On the other hand,

$$|b_{v}|=\alpha|c_{v}|+(1-\alpha)|w_{v}|\geq|\alpha c_{v}+(1-\alpha)w_{v}|=|y_{v}|.$$

To conclude that $|a_{v}|\leq|y_{v}|$ we will use the Cauchy-Schwarz inequality. If we regard a quaternion $q\in\mathbb{H}$ as a vector in $\mathbb{R}^{4}$, its norm is given by $\langle q,q\rangle=|q|^{2}$, where $\langle.,.\rangle$ is the usual inner product in real vector spaces. Then we have:

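$$\begin{aligned}|y_{v}|^{2}&=\alpha^{2}|c_{v}|^{2}+(1-\alpha)^{2}|w_{v}|^{2}+\alpha(1-\alpha)\Big(\langle c_{v},w_{v}\rangle+\langle w_{v},c_{v}\rangle\Big)\\&\geq\alpha^{2}|c_{v}|^{2}+(1-\alpha)^{2}|w_{v}|^{2}-2\alpha(1-\alpha)|c_{v}||w_{v}|\\&=\big(\alpha|c_{v}|-(1-\alpha)|w_{v}|\big)^{2}.\end{aligned}$$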

Since $|c_{v}|=|c_{(i),v}|$ and $|w_{v}|=|w_{(i),v}|$ , we have:

$$|y_{v}|^{2}\geq\big(\alpha|c_{(i),v}|-(1-\alpha)|w_{(i),v}|\big)^{2}.$$

Using the equality $\Big(\alpha c_{(i)}+(1-\alpha)w_{(i)}^{*}\Big)_{v}=\alpha|c_{(i),v}|i-(1-\alpha)|w_{(i),v}|i$, it follows that

$$|y_{v}|^{2}\geq\Big|\Big(\alpha c_{(i)}^{*}+(1-\alpha)w_{(i)}\Big)_{v}\Big|^{2}=|a_{v}|^{2}.$$

Therefore, $|a_{v}|\leq|y_{v}|\leq|b_{v}|.$

It remains to prove that $a\in W$ . If $a=\alpha c_{(i)}+(1-\alpha)w_{(i)}^{*}$ , by hypothesis $c_{(i)}\in\mathscr{C}(W\cap\mathbb{C})$ and $w_{(i)}^{*}\in W\cap\mathbb{C}$ , then any convex combination of them is also in $W\cap\mathbb{C}$ . If $a=\alpha c_{(i)}^{*}+(1-\alpha)w_{(i)}$ then $a\in W$ , because $c_{(i)}^{*}\in\mathscr{C}(W\cap\mathbb{C})$ by lemma 3.6, and $w_{(i)}\in W\cap\mathbb{C}$ . ∎
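The two-sided bound $|a_{v}|\leq|y_{v}|\leq|b_{v}|$ used in this proof can be checked numerically. The sketch below is an illustration only: the sample quaternions `c` and `w` and the helper names are hypothetical, and quaternions are stored as `(real, i, j, k)` tuples.

```python
import math

def vec_norm(q):
    """Modulus |q_v| of the imaginary (vector) part of q = (real, i, j, k)."""
    return math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)

def convex_combination(alpha, p, q):
    """Componentwise alpha * p + (1 - alpha) * q."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(p, q))

def bounds_for_vector_part(alpha, c, w):
    """Return (|a_v|, |y_v|, |b_v|) for y = alpha * c + (1 - alpha) * w."""
    lower = abs(alpha * vec_norm(c) - (1 - alpha) * vec_norm(w))   # |a_v|
    upper = alpha * vec_norm(c) + (1 - alpha) * vec_norm(w)        # |b_v|
    y = convex_combination(alpha, c, w)
    return lower, vec_norm(y), upper

c = (0.5, 1.0, 0.0, 0.0)     # plays the role of c = c_(i) in C^+
w = (-1.0, 0.0, 2.0, -1.0)   # an arbitrary quaternion
checks = [bounds_for_vector_part(k / 10, c, w) for k in range(11)]
```

Each triple in `checks` should be non-decreasing: the upper bound is the triangle inequality and the lower bound is the reverse triangle inequality obtained from Cauchy-Schwarz.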

The next result establishes the relation between the center of the numerical range $\mathscr{C}(W)$ and the center of the bild $\mathscr{C}(W\cap\mathbb{C})$.

Theorem 3.8.

The center of the numerical range is such that

$$\mathscr{C}\big(W\big)=\Big[\mathscr{C}(W\cap\mathbb{C})\Big].$$

Proof.

Let $c\in\mathscr{C}(W)$. For some $q\in\mathbb{S}_{\mathbb{P}}$, we have $c\in\mathscr{C}(W)\cap\mathrm{span}\,\{1,q\}$. Using reasoning similar to that in the proof of proposition 3.7, we can show that

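$$\mathscr{C}(W)\cap\mathrm{span}\,\{1,q\}=\mathscr{C}(W\cap\mathrm{span}\,\{1,q\})=\mathscr{C}(W^{(q)}).$$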

Now, $c\in\mathscr{C}(W)$ if and only if $c\in\mathscr{C}(W^{(q)})$ , for some $q\in\mathbb{S}_{\mathbb{P}}$ , that is,

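$$c\in\mathscr{C}(W)\Leftrightarrow c\in\mathscr{C}(W^{(q)}),\ \text{for some }q\in\mathbb{S}_{\mathbb{P}}.$$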

By lemma 3.3, $\mathscr{C}(W^{(q)})\dot{\sim}\mathscr{C}(W^{(i)})$ . We conclude that

$$c\in[\mathscr{C}(W^{(q)})]=[\mathscr{C}(W^{(i)})]=[\mathscr{C}(W\cap\mathbb{C})].$$

∎

If we use the fact that $W$ is the set of all elements similar to those in $W\cap\mathbb{C}$ , that is, $W=\Big{[}W\cap\mathbb{C}\Big{]}$ , the above result can be written in the following way:

$$\mathscr{C}\Big(\big[W\cap\mathbb{C}\big]\Big)=\Big[\mathscr{C}(W\cap\mathbb{C})\Big].$$

In other words, the operations of taking the center and of taking the equivalence classes of a numerical range commute.

4. Characterization of the center of the bild

We now know that it is possible to characterize the center of the numerical range from the center of the bild. On the other hand, lemma 3.6 guarantees that the lower part of the center of the bild is the conjugate of the upper part,

[TABLE]

and we conclude that to determine $\mathscr{C}(W)$ we only need to know $\mathscr{C}^{+}$ . From corollary 3.5, we may focus only on non-hermitian matrices.

By the convexity of the upper bild, the segment joining any two elements in the upper bild is contained in it. Therefore an element of the upper bild is not in the center if and only if a convex combination with an element in the lower bild is not in the bild. That is, an element $\boldsymbol{\omega}\in W\cap\mathbb{C}^{+}$ is not in the center of the bild, $\boldsymbol{\omega}\not\in\mathscr{C}(W\cap\mathbb{C})$, if and only if there is $\boldsymbol{z}\in W\cap\mathbb{C}^{-}$ such that the segment connecting the two is not contained in the bild, i.e. $[\boldsymbol{\omega},\boldsymbol{z}]\not\subseteq W\cap\mathbb{C}$. The argument we will use is built upon the fact that a segment joining an element of the upper bild to an element of the lower bild is not totally contained in the bild if and only if it crosses the reals outside of the bild. Thus, either an element $\boldsymbol{\omega}$ of the upper bild has all its segments $[\boldsymbol{\omega},\boldsymbol{z}]$, for $\boldsymbol{z}\in W\cap\mathbb{C}^{-}$, crossing the real line inside the bild, that is, $[\boldsymbol{\omega},\boldsymbol{z}]\cap\mathbb{R}\subseteq B$, in which case $\boldsymbol{\omega}$ is in the center, or there is one of these segments that crosses the real line outside the bild, and the element $\boldsymbol{\omega}$ is not in the center.
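The crossing criterion described above can be sketched in a few lines. The code is an illustration under stated assumptions, not the paper's method: points are written as `(x, y)` pairs, the real part of the bild is taken to be an interval `[m, M]` supplied by the caller, and the function names are hypothetical.

```python
def real_axis_crossing(omega, z):
    """x-coordinate where the segment from omega (y > 0) to z (y < 0) meets y = 0."""
    w1, w2 = omega
    z1, z2 = z
    if not (w2 > 0 and z2 < 0):
        raise ValueError("omega must lie strictly above and z strictly below the real axis")
    # Solve (1 - t) * w2 + t * z2 = 0 for the segment parameter t in (0, 1).
    t = w2 / (w2 - z2)
    return w1 + t * (z1 - w1)

def crosses_inside(omega, z, m, M):
    """True when the segment [omega, z] crosses the real axis within [m, M]."""
    return m <= real_axis_crossing(omega, z) <= M

inside = crosses_inside((0.0, 1.0), (0.0, -1.0), -1.0, 1.0)    # crosses at x = 0
outside = crosses_inside((0.0, 1.0), (-3.0, -1.0), -1.0, 1.0)  # crosses at x = -1.5
```

In the second call the crossing point lies to the left of `m`, which is the situation in which the segment leaves the bild.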

For the rest of this section we will slightly change notation and write $z=x+iy$ as $(x,y)$. Let $m=\min W\cap\mathbb{R}$ and $M=\max W\cap\mathbb{R}$ be the minimum and maximum of the real elements in the bild. Using the previous reasoning, but from a dual perspective, to find out if an element $\boldsymbol{\omega}$ in the upper bild is in the center, we only need to see whether the segments joining $\boldsymbol{\omega}$ to $(M,0)$ and to $(m,0)$ intersect the interior of the lower bild. If one of them does, the element is not in the center. For instance, if the segment joining $\boldsymbol{\omega}\in B^{+}$ to $(m,0)$ intersects $B^{-}$ at $\boldsymbol{z}$ in the lower part of the interior of the bild, then there is an element $\boldsymbol{\tilde{z}}$ to the left of $\boldsymbol{z}$ such that the segment $[\boldsymbol{\omega},\boldsymbol{\tilde{z}}]$ will cross the reals to the left of $(m,0)$, and therefore outside of the bild.

The next results formalize this intuitive argument. To this end, we define for each $\boldsymbol{\omega}\in\mathbb{C}^{+}$ two lines, one denoted $l_{\boldsymbol{\omega}}$ connecting $\boldsymbol{\omega}=(\omega_{1},\omega_{2})$ to $(m,0)$, and the other denoted $L_{\boldsymbol{\omega}}$ connecting $\boldsymbol{\omega}$ to $(M,0)$. Since the real points of the numerical range belong to the center (see theorem 3.4), it is enough to consider points $\boldsymbol{\omega}=(\omega_{1},\omega_{2})$, with $\omega_{2}>0$. The lines are given by

$$l_{\boldsymbol{\omega}}(y)=ay+m\quad\text{and}\quad L_{\boldsymbol{\omega}}(y)=by+M,$$

with $a=\dfrac{\omega_{1}-m}{\omega_{2}}$ and $b=\dfrac{\omega_{1}-M}{\omega_{2}}$ .
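A small sketch of the two lines may help fix the notation. It assumes, as suggested by the later use of $l_{\boldsymbol{\omega}}(0)=m$, that each line is read as a function $x=l(y)$ of the imaginary coordinate; the function name `line_through` and the sample values of `m`, `M` and `omega` are hypothetical.

```python
def line_through(omega, real_point):
    """Return y -> x for the line through omega = (w1, w2), w2 > 0, and (real_point, 0)."""
    w1, w2 = omega
    if w2 <= 0:
        raise ValueError("omega must lie strictly above the real axis")
    slope = (w1 - real_point) / w2      # the coefficient a (or b) from the text
    return lambda y: slope * y + real_point

m, M = -1.0, 2.0                        # hypothetical extreme real points of the bild
omega = (0.5, 1.5)
l_omega = line_through(omega, m)        # passes through omega and (m, 0)
L_omega = line_through(omega, M)        # passes through omega and (M, 0)
```

Evaluating either function at `y = 0` returns the real endpoint, and evaluating it at `y = omega[1]` returns `omega[0]`; negative values of `y` follow the line into the lower half-plane, where it is compared with the lower bild.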

Let $y_{m}=\min\{\pi_{\text{Span}\{i\}}(B)\}$ and $y_{M}=\max\{\pi_{\text{Span}\{i\}}(B)\}$ . By symmetry of the bild, $y_{M}=-y_{m}$ . Since the matrix is non-hermitian, $y_{M}>0$ .

We may define, for $y\in[y_{m},0]$ , two functions:

[TABLE]

Notice that $x_{1}(0)=m$ and $x_{2}(0)=M$ . According to [Roc, theorem 5.3], $x_{1}(\cdot)$ is convex and $x_{2}(\cdot)$ is concave. The lower bild may be written using $x_{1}(\cdot)$ and $x_{2}(\cdot)$ :

[TABLE]

The interior of the lower bild is given by:

[TABLE]

The next result gives a characterization of $\mathscr{C}(B)$ , when $m<M$ . For $\boldsymbol{\omega}\in B^{+}$ the lines $l_{\boldsymbol{\omega}}$ and $L_{\boldsymbol{\omega}}$ do not cross over the interior of the lower bild, if and only if, $\boldsymbol{\omega}\in\mathscr{C}(B)$ .

Theorem 4.1.

Let $m<M$ and let $\boldsymbol{\omega}\in B^{+}$ . Then, $\boldsymbol{\omega}\in\mathscr{C}(B)$ if, and only if,

$$(l_{\boldsymbol{\omega}}\cup L_{\boldsymbol{\omega}})\cap{(B^{-})}^{\mathrm{o}}=\emptyset.$$

Proof.

We begin by observing the following. Let $\boldsymbol{\omega}=(\omega_{1},\omega_{2})\in B^{+}$ with $\omega_{2}>0$ . The line $l_{\boldsymbol{\omega}}$ passing through $\boldsymbol{\omega}$ and $(m,0)$ can be written as:

[TABLE]

and define two half planes:

[TABLE]

To prove that if $\boldsymbol{\omega}\in\mathscr{C}(B)$ , then $(l_{\boldsymbol{\omega}}\cup L_{\boldsymbol{\omega}})\cap{\kern 0.0pt(B^{-})}^{\mathrm{o}}=\emptyset$ we proceed by contrapositive.

Fix an element $\boldsymbol{\omega}=(\omega_{1},\omega_{2})\in B^{+}$ as before, i.e., with $\omega_{2}>0$, and suppose there is an element $\boldsymbol{z}\in l_{\boldsymbol{\omega}}\cap{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ (if $\boldsymbol{z}\in L_{\boldsymbol{\omega}}\cap{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ the proof is analogous). Since $\boldsymbol{z}=(z_{1},z_{2})\in l_{\boldsymbol{\omega}}$, the line $l_{\boldsymbol{\omega}}$ may also be written as

[TABLE]

Let $N_{\varepsilon}(\boldsymbol{z})\subset{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ be a neighborhood of $\boldsymbol{z}$ . Then, there is $\boldsymbol{\tilde{z}}=({\tilde{z}}_{1},{\tilde{z}}_{2})\in N_{\varepsilon}(\boldsymbol{z})$ such that $\tilde{z_{1}}<z_{1}$ and $\tilde{z_{2}}=z_{2}$ .

The line $\tilde{l}$ passing through $\boldsymbol{\omega}$ and $\boldsymbol{\tilde{z}}$ is

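$$\tilde{l}=\Big\{(x,y)\in\mathbb{R}^{2}:x=g(y)=\frac{\omega_{1}-\tilde{z}_{1}}{\omega_{2}-\tilde{z}_{2}}\,(y-\tilde{z}_{2})+\tilde{z}_{1}\Big\}.$$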

Define the affine function $h(.)$ by:

$$h(y)=g(y)-f(y).$$

Clearly, $h(\omega_{2})=0$ and $h(z_{2})=\tilde{z}_{1}-z_{1}<0$ . Since $\omega_{2}>0$ and $z_{2}<0$ , there is $\beta\in(0,1)$ such that $0=\beta\omega_{2}+(1-\beta)z_{2}$ . Moreover, since $h$ is affine,

$$h(0)=h(\beta\omega_{2}+(1-\beta)z_{2})=\beta h(\omega_{2})+(1-\beta)h(z_{2})=(1-\beta)h(z_{2})<0,$$

and so, $g(0)<f(0)=l_{\boldsymbol{\omega}}(0)=m$. Hence, the line passing through $\boldsymbol{\omega}$ and $\tilde{\boldsymbol{z}}$ crosses the real axis at $(g(0),0)$, a point outside $B\cap\mathbb{R}$, which implies that $[\boldsymbol{\omega},\tilde{\boldsymbol{z}}]\nsubseteq B$ and $\boldsymbol{\omega}\notin\mathscr{C}(B)$.

Now we prove the converse, that is, for $\boldsymbol{\omega}\in B^{+}$ if $\boldsymbol{\omega}\notin\mathscr{C}(B)$ then $(l_{\boldsymbol{\omega}}\cup L_{\boldsymbol{\omega}})\cap{\kern 0.0pt(B^{-})}^{\mathrm{o}}\neq\emptyset.$ Since $\boldsymbol{\omega}\notin\mathscr{C}(B)$ , there is a point $\boldsymbol{z}=(z_{1},z_{2})\in{B}^{-}$ such that the line segment $[\boldsymbol{\omega},\boldsymbol{z}]$ is not contained in the bild.

Assume that the line containing $[\boldsymbol{\omega},\boldsymbol{z}]$ , call it $x=g(y)$ , intersects the real line ( $y=0$ ) at $(\nu,0)$ . Since $[\boldsymbol{\omega},\boldsymbol{z}]\nsubseteq B$ then $\nu\notin[m,M]$ . Otherwise, if $(\nu,0)\in B$ , convexity of the upper bild implies that $[\boldsymbol{\omega},(\nu,0)]\subseteq B^{+}$ and convexity of the lower bild implies that $[(\nu,0),\boldsymbol{z}]\subseteq B^{-}$ . Thus,

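$$[\boldsymbol{\omega},\boldsymbol{z}]=[\boldsymbol{\omega},(\nu,0)]\cup[(\nu,0),\boldsymbol{z}]\subseteq B^{+}\cup B^{-}=B,$$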

which contradicts our hypothesis. We will assume that $\nu<m$ (when $\nu>M$ the proof is analogous). We claim that we can take $\boldsymbol{z}=(z_{1},z_{2})\in{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ . In fact, if $(z_{1},z_{2})$ is on the boundary of $B$ , take $\boldsymbol{z}_{\boldsymbol{\epsilon}}=(z_{1}+\epsilon_{1},z_{2}+\epsilon_{2})$ , with $\boldsymbol{\epsilon}=(\epsilon_{1},\epsilon_{2})$ small enough such that $\boldsymbol{z}_{\boldsymbol{\epsilon}}\in{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ and $[\boldsymbol{\omega},\boldsymbol{z}_{\boldsymbol{\epsilon}}]\cap\mathbb{R}=\nu^{\prime}$ , close enough to $\nu$ in order to satisfy $\nu^{\prime}<m$ . In this way, there is a point $\boldsymbol{z}_{\boldsymbol{\epsilon}}\in{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ such that $[\boldsymbol{\omega},\boldsymbol{z}_{\boldsymbol{\epsilon}}]\nsubseteq B$ .

Since $\nu<l_{\boldsymbol{\omega}}(0)=m<M$ then $(\nu,0)$ and $(M,0)$ must be in different half-planes, that is, $(\nu,0)\in\wp^{-}$ and $(M,0)\in\wp^{+}$ .

We now show that $\boldsymbol{z}$ and $(M,0)$ are in different half-planes, arguing as in the first part of the proof, but now with $g$ being the line that passes through the points $\boldsymbol{\omega}$, $(\nu,0)$ and $\boldsymbol{z}$, and $f$ being the line that contains $[\boldsymbol{\omega},(m,0)]$. Both lines pass through $\boldsymbol{\omega}$, so $h=g-f$ satisfies $h(\omega_{2})=0$, and $h(0)=g(0)-f(0)=\nu-m<0$. Since $\omega_{2}>0$ and $z_{2}<0$, there is $\beta\in(0,1)$ such that $0=\beta\omega_{2}+(1-\beta)z_{2}$. Since $h$ is affine, $h(0)=\beta h(\omega_{2})+(1-\beta)h(z_{2})=(1-\beta)h(z_{2})<0$, and so $h(z_{2})<0$. It follows that $z_{1}<l_{\boldsymbol{\omega}}(z_{2})$ and therefore $\boldsymbol{z}\in\wp^{-}.$

Let $\gamma(x,y)=x-l_{\boldsymbol{\omega}}(y)$. Since $\gamma(\boldsymbol{z})<0$ and $\gamma(M,0)>0$, by continuity of $\gamma$ the segment that joins $\boldsymbol{z}$ to $(M,0)$ passes through a point $\boldsymbol{z}^{\prime}$ with $\gamma(\boldsymbol{z}^{\prime})=0$, that is, $[\boldsymbol{z},(M,0)]\cap l_{\boldsymbol{\omega}}=\{\boldsymbol{z}^{\prime}\}$. Since $\boldsymbol{z}^{\prime}\in l_{\boldsymbol{\omega}}$, it only remains to prove that $\boldsymbol{z}^{\prime}\in(B^{-})^{\mathrm{o}}$. Since $\boldsymbol{z}=(z_{1},z_{2})$ and $(M,0)$ both belong to $B^{-}$, by convexity of $B^{-}$ we have

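$$\boldsymbol{z}^{\prime}=(z^{\prime}_{1},z^{\prime}_{2})=\alpha(M,0)+(1-\alpha)\boldsymbol{z}=\big(\alpha M+(1-\alpha)z_{1},\,(1-\alpha)z_{2}\big)\in B^{-},$$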

for some $\alpha\in(0,1)$. Note that $\alpha=0$ would give $\boldsymbol{z}^{\prime}=\boldsymbol{z}$ and $\alpha=1$ would give $\boldsymbol{z}^{\prime}=(M,0)$; neither can happen, because $\boldsymbol{z}^{\prime}\in l_{\boldsymbol{\omega}}$ while $\boldsymbol{z},(M,0)\notin l_{\boldsymbol{\omega}}$. We know that $\boldsymbol{z}\in(B^{-})^{\mathrm{o}}$, and so,

$$x_{1}(z_{2})<z_{1}<x_{2}(z_{2}).\tag{4.4}$$

From (4.4) and since $M>x_{1}(0)=m$ , we have:

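$$z^{\prime}_{1}=\alpha M+(1-\alpha)z_{1}>\alpha x_{1}(0)+(1-\alpha)x_{1}(z_{2})\geq x_{1}\big((1-\alpha)z_{2}\big)=x_{1}(z^{\prime}_{2}),$$

where the last inequality uses the convexity of $x_{1}(\cdot)$.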

With a similar reasoning and using that $\alpha\in(0,1)$ we see that $z^{\prime}_{1}<x_{2}(z^{\prime}_{2})$ . It follows that

$$x_{1}(z^{\prime}_{2})<z^{\prime}_{1}<x_{2}(z^{\prime}_{2}).$$

Now we need to check that $y_{m}<z^{\prime}_{2}<0$ . Since $\boldsymbol{z}\in{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ we know that $y_{m}<z_{2}<0$ , and so,

$$y_{m}<z_{2}<(1-\alpha)z_{2}=z^{\prime}_{2}<0.$$

We conclude that $\boldsymbol{z}^{\prime}\in{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ . ∎
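The criterion of theorem 4.1 can be explored numerically. The sketch below is an illustration only and not part of the paper. It assumes a hypothetical lower bild bounded by $x_{1}(y)=\tfrac{1}{2}y+y^{2}$ and $x_{2}(y)=1-\tfrac{1}{2}y-y^{2}$ on $[-1,0]$ (so $m=0$ and $M=1$), and it samples each line at finitely many heights, so a negative answer is only evidence, not a proof, that the line misses the interior.

```python
# Hypothetical data (assumed for illustration): a lower bild bounded by
# x1 (convex) and x2 (concave) on [Y_MIN, 0], with m = x1(0) and M = x2(0).
Y_MIN, m, M = -1.0, 0.0, 1.0

def x1(y):
    return 0.5 * y + y * y

def x2(y):
    return 1.0 - 0.5 * y - y * y

def line_through(omega, base_x):
    """Return y -> x for the line joining omega to (base_x, 0); needs omega[1] > 0."""
    w1, w2 = omega
    slope = (w1 - base_x) / w2
    return lambda y: base_x + slope * y

def meets_interior(line, samples=1000):
    """Sampled test: does the line enter the region x1 < x < x2, Y_MIN < y < 0?"""
    for k in range(1, samples + 1):
        y = Y_MIN * k / (samples + 1)  # strictly between Y_MIN and 0
        if x1(y) < line(y) < x2(y):
            return True
    return False

def passes_line_test(omega):
    """True when neither sampled line through omega enters the interior."""
    l_omega = line_through(omega, m)  # joins omega to (m, 0)
    L_omega = line_through(omega, M)  # joins omega to (M, 0)
    return not (meets_interior(l_omega) or meets_interior(L_omega))
```

With these assumed boundaries, the point $(0.5,0.2)$ of the upper bild passes the test, whereas for the point $(0,0.5)$ the line through $(m,0)$ is vertical in the $(x,y)$ plane (it has slope $0$ as a function of $y$) and enters the interior just below the real axis.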

Relying on our previous results, we will now prove the existence of two lines containing $(m,0)$ and $(M,0)$ that define the upper boundary of the center. Such lines are denoted respectively by $l$ and $L$ .

Any convex or concave function has one-sided derivatives [Roc, theorem 23.1]; therefore, let $a=x^{\prime}_{1}(0^{-})$ and $b=x^{\prime}_{2}(0^{-})$ be the left derivatives at $0$ of $x_{1}(\cdot)$ and $x_{2}(\cdot)$, respectively.

Let the left tangent lines to $x_{1}$ and $x_{2}$ at $0$ be given by the sets

$$l=\{(x,y)\in\mathbb{R}^{2}:x=l(y)=m+ay\}\quad\text{and}\quad L=\{(x,y)\in\mathbb{R}^{2}:x=L(y)=M+by\},\tag{4.5}$$

respectively. Since $x_{1}(\cdot)$ is convex and $x_{2}(\cdot)$ is concave we have, [Roc, theorem 25.1], $l(y)\leq x_{1}(y)$ and $x_{2}(y)\leq L(y)$ , for every $y\in[y_{m},0]$ .
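A minimal numerical sketch of this construction follows; it is not taken from the paper. The boundary functions `x1` and `x2` are hypothetical, and the left derivatives $a$ and $b$ are only approximated by a backward difference quotient with an assumed step `h`, so the tangent lines obtained are approximations of $l$ and $L$.

```python
def left_derivative(f, y0, h=1e-6):
    """Backward difference quotient approximating the left derivative f'(y0^-)."""
    return (f(y0) - f(y0 - h)) / h

# Hypothetical boundary functions of a lower bild on [-1, 0] (assumptions).
def x1(y):
    return 0.5 * y + y * y        # convex, x1(0) = m = 0

def x2(y):
    return 1.0 - 0.5 * y - y * y  # concave, x2(0) = M = 1

m, M = x1(0.0), x2(0.0)
a = left_derivative(x1, 0.0)  # close to 0.5 for this x1
b = left_derivative(x2, 0.0)  # close to -0.5 for this x2

def l(y):
    """Approximate left tangent line to x1 at 0: x = m + a*y."""
    return m + a * y

def L(y):
    """Approximate left tangent line to x2 at 0: x = M + b*y."""
    return M + b * y
```

On a coarse grid of heights in $[-1,0]$ one can then check, up to a small tolerance accounting for the finite-difference error, that `l(y) <= x1(y)` and `x2(y) <= L(y)`.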

Proposition 4.2.

Let $m<M$ and let $\boldsymbol{\omega}=(\omega_{1},\omega_{2})\in B^{+}$ . Then,

(i) $l(\omega_{2})\leq\omega_{1}$ if, and only if, $l_{\boldsymbol{\omega}}\cap(B^{-})^{\mathrm{o}}=\emptyset$;

(ii) $\omega_{1}\leq L(\omega_{2})$ if, and only if, $L_{\boldsymbol{\omega}}\cap(B^{-})^{\mathrm{o}}=\emptyset$.

Proof.

We will prove (i). A similar reasoning proves (ii). Let $\boldsymbol{\omega}\in B^{+}$ with $\omega_{2}>0$ and $l_{\boldsymbol{\omega}}$ be the line passing through $\boldsymbol{\omega}$ and $(m,0)$ . We can write $l_{\boldsymbol{\omega}}(y)=\tilde{a}y+m$ , with $\tilde{a}=\dfrac{\omega_{1}-m}{\omega_{2}}$ .

Now we will prove that if $\omega_{1}\geq l(\omega_{2})$ the line $l_{\boldsymbol{\omega}}$ does not intersect ${\kern 0.0pt(B^{-})}^{\mathrm{o}}$ . Since $\omega_{1}=l_{\boldsymbol{\omega}}(\omega_{2})\geq l(\omega_{2})$ , it is clear that $\tilde{a}\omega_{2}+m\geq a\omega_{2}+m$ , i.e., $(\tilde{a}-a)\omega_{2}\geq 0$ . Since $\omega_{2}>0$ we have $\tilde{a}\geq a$ . For $y\geq 0$ , $(l_{\boldsymbol{\omega}}(y),y)\in\mathbb{C}^{+}$ and so

$$(l_{\boldsymbol{\omega}}(y),y)\notin(B^{-})^{\mathrm{o}}.$$

For $y<0$ we have $l_{\boldsymbol{\omega}}(y)\leq l(y)$ , since $\tilde{a}\geq a$ . From the convexity of $x_{1}(\cdot)$ and using [Roc, theorem 25.1] we have $x_{1}(y)\geq l(y)$ for any $y\in[y_{m},0]$ . Then, $l_{\boldsymbol{\omega}}(y)\leq x_{1}(y).$ Therefore, $(l_{\boldsymbol{\omega}}(y),y)\in l_{\boldsymbol{\omega}}$ with $l_{\boldsymbol{\omega}}(y)\leq x_{1}(y)$ and from (4.3) we see that $(l_{\boldsymbol{\omega}}(y),y)\notin{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ .

To prove the converse, we want to show that if $l(\omega_{2})>\omega_{1}$ then $l_{\boldsymbol{\omega}}\cap{\kern 0.0pt(B^{-})}^{\mathrm{o}}\neq\emptyset$ , that is, the line $l_{\boldsymbol{\omega}}\supseteq[(\omega_{1},\omega_{2}),(m,0)]=[\boldsymbol{\omega},(m,0)]$ intersects ${\kern 0.0pt(B^{-})}^{\mathrm{o}}$ . Again, we have $\omega_{1}=l_{\boldsymbol{\omega}}(\omega_{2})<l(\omega_{2})$ , and therefore $(\tilde{a}-a)\omega_{2}<0$ . Since $\omega_{2}>0$ , necessarily $\tilde{a}<a$ . Define, for $y\in[y_{m},0]$ ,

$$h(y)=l_{\boldsymbol{\omega}}(y)-x_{1}(y).$$

By the first order Taylor’s approximation of $x_{1}(\cdot)$ , for small $\epsilon>0$ we get

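$$h(-\epsilon)=l_{\boldsymbol{\omega}}(-\epsilon)-x_{1}(-\epsilon)=(m-\tilde{a}\epsilon)-\big(m-a\epsilon+o(\epsilon)\big)=(a-\tilde{a})\epsilon+o(\epsilon).$$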

Since $\tilde{a}<a$ , it follows that $h(-\epsilon)>0$ , for small $\epsilon>0$ . In other words, $l_{\boldsymbol{\omega}}(-\epsilon)>x_{1}(-\epsilon)$ for $\epsilon$ small enough. Taking into account that $l_{\boldsymbol{\omega}}(0)=m<x_{2}(0)=M$ and that $l_{\boldsymbol{\omega}}$ and $x_{2}(\cdot)$ are continuous [Roc, corollary 10.1.1], for $\epsilon$ small enough we have $l_{\boldsymbol{\omega}}(-\epsilon)<x_{2}(-\epsilon)$ . Therefore, we can choose an $\epsilon>0$ such that $x_{1}(-\epsilon)<l_{\boldsymbol{\omega}}(-\epsilon)<x_{2}(-\epsilon)$ and $y_{m}<-\epsilon<0$ . Then $(l_{\boldsymbol{\omega}}(-\epsilon),-\epsilon)\in{\kern 0.0pt(B^{-})}^{\mathrm{o}}$ . ∎

We can now present a general way to determine the center. Let $\pi_{m}\equiv\mathrm{min}\,\pi_{\mathbb{R}}(W)$ and $\pi_{M}\equiv\mathrm{max}\,\pi_{\mathbb{R}}(W)$ .

Theorem 4.3.

Let $\boldsymbol{\omega}=(\omega_{1},\omega_{2})\in B^{+}$ . Then, $l(\omega_{2})\leq\omega_{1}\leq L(\omega_{2})$ if, and only if, $\boldsymbol{\omega}\in\mathscr{C}(B)$ .

Proof.

When $m<M$, proposition 4.2 and theorem 4.1 prove the stated equivalence. For the case $m=M$ we first determine $\mathscr{C}(B)$ and then prove that its upper part coincides with the set $\{(\omega_{1},\omega_{2})\in B^{+}:l(\omega_{2})\leq\omega_{1}\leq L(\omega_{2})\}$.

When $m=M$ and the bild is a vertical segment $B=\{m\}\times[y_{m},y_{M}]$ then, clearly, $\mathscr{C}(B)=B$ and, in this case, $\mathscr{C}^{+}(B)=B^{+}=\{m\}\times[0,y_{M}]$. If $m=M$ but the bild is not a vertical segment ($\pi_{m}<\pi_{M}$) then we claim the center is $\mathscr{C}=\{(m,0)\}$. To see this, first consider $\boldsymbol{z}=(z_{1},z_{2})\in B$ with $z_{1}\neq m$. Then ${\boldsymbol{z}}^{*}=(z_{1},-z_{2})\in B$ and $\tfrac{1}{2}\boldsymbol{z}+\tfrac{1}{2}{\boldsymbol{z}}^{*}=(z_{1},0)\notin B$, since $m=M$ forces $B\cap\mathbb{R}=\{m\}$. Therefore $\boldsymbol{z}\notin\mathscr{C}(B)$. It remains to consider the case where $\boldsymbol{\omega}=(m,y)\in B$, for some $y\neq 0$. There are $(z_{1},z_{2}),(z_{1},-z_{2})\in B$ with $z_{1}\neq m$ and $z_{2}\neq 0$. Assume, without loss of generality, that $z_{2}$ and $y$ have opposite signs. Then there is a $\beta\in(0,1)$ such that $\beta y+(1-\beta)z_{2}=0$. Clearly, $\beta m+(1-\beta)z_{1}\neq m$, so $\beta m+(1-\beta)z_{1}\notin B\cap\mathbb{R}$, thus

$$\beta(m,y)+(1-\beta)(z_{1},z_{2})=\big(\beta m+(1-\beta)z_{1},\,0\big)\notin B.$$

We conclude that $\beta(m,y)+(1-\beta)(z_{1},z_{2})\not\in B$ and therefore that $(m,y)\notin\mathscr{C}(B)$, for $y\neq 0$.

In the case where $B=\{m\}\times[y_{m},y_{M}]$ , $x_{1}(y)=m=x_{2}(y)$ for $y\in[y_{m},0]$ and $x_{1}^{\prime}(0^{-})=x_{2}^{\prime}(0^{-})=0$ , thus $l(y)=L(y)=m$ for any $y\in\mathbb{R}$ . Then

$$\{(x,y)\in B^{+}:l(y)\leq x\leq L(y)\}=\{(x,y)\in B^{+}:x=m\}=B^{+}=\mathscr{C}^{+}(B).$$

When $m=M$ and $\pi_{m}<\pi_{M}$ , we know that $x_{1}(\cdot)\leq x_{2}(\cdot)$ and $x_{1}(0)=x_{2}(0)=m$ . Then $a\equiv x_{1}^{\prime}(0^{-})\geq x_{2}^{\prime}(0^{-})\equiv b$ . In the case where $a>b$ we have

$$l(y)=m+ay>m+by=L(y),\quad\text{for every }y>0.$$

Therefore, $\{(x,y)\in B^{+}:l(y)\leq x\leq L(y)\}=\{(m,0)\}$ and this is, in fact, the upper center of $B$.

We now consider $a=b\neq 0$ (the case where $a=b=0$ is the one where $B=\{m\}\times[y_{m},y_{M}]$ ). Since $x_{1}(\cdot)$ is convex and $x_{2}(\cdot)$ is concave we know that, using again [Roc, theorem 25.1], $l(y)\leq x_{1}(y)\leq x_{2}(y)\leq L(y)$ . As a consequence of $a=b$ we have that $l=L$ and thus

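$$l(y)=x_{1}(y)=x_{2}(y)=L(y),\quad\text{for every }y\in[y_{m},0],$$

that is, the lower bild is a line segment, and we can write it as the set

$$B^{-}=\{(x,y)\in\mathbb{R}^{2}:x=m+ay,\ y\in[y_{m},0]\}.$$

Since the upper bild is the conjugate of the lower bild,

$$B^{+}=\{(x,y)\in\mathbb{R}^{2}:x=m-ay,\ y\in[0,y_{M}]\}.$$

Then the intersection of $B^{+}$ and $l=\{(x,y)\in\mathbb{R}^{2}:x=m+ay,y\in\mathbb{R}\}$, when $a\neq 0$, is just $\{(m,0)\}$. That is,

$$\{(x,y)\in B^{+}:l(y)\leq x\leq L(y)\}=B^{+}\cap l=\{(m,0)\}=\mathscr{C}^{+}(B).$$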

∎
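Once $m$, $M$ and the slopes $a$, $b$ of the tangent lines are known, the condition of theorem 4.3 is a pair of linear inequalities. The following sketch is only an illustration: the function name and the sample values are hypothetical, and the caller is assumed to have already verified that the point belongs to $B^{+}$.

```python
def in_upper_center(omega, m, M, a, b):
    """Check l(w2) <= w1 <= L(w2), where l(y) = m + a*y and L(y) = M + b*y.

    The point omega = (w1, w2) is assumed to lie in the upper bild B^+;
    this function does not verify that assumption.
    """
    w1, w2 = omega
    return m + a * w2 <= w1 <= M + b * w2

# Hypothetical values (assumptions): m = 0, M = 1, a = 0.5, b = -0.5.
inside = in_upper_center((0.5, 0.2), 0.0, 1.0, 0.5, -0.5)   # 0.1 <= 0.5 <= 0.9
outside = in_upper_center((0.0, 0.5), 0.0, 1.0, 0.5, -0.5)  # 0.25 <= 0.0 fails
```

With these values the two lines meet at $(0.5,1)$, where both inequalities hold with equality.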

A simple observation on the slope of the lines $l$ and $L$ allows us to give a different proof of the known result of Au-Yeung (see, [AY1, theorem 3]), which establishes an equivalent condition for the convexity of the quaternionic numerical range.

It is well known [Roc, theorem 23.1] that for any convex function $f$ of real variable and any fixed element $y_{1}$ in the domain of $f$ the function defined by

$$y\mapsto\frac{f(y)-f(y_{1})}{y-y_{1}},\quad y\neq y_{1},$$

is nondecreasing in $y$. Then any line that joins $(f(y),y)$ and $(f(y_{1}),y_{1})$ in the graph of $f$ with $y<y_{1}$ has slope at most $f^{\prime}(y_{1}^{-})$. Notice now that there is an element $(\pi_{m},y_{\pi_{m}})$ in the lower bild. Using the previous conclusion when the convex function is $x_{1}$, the reference point is $y_{1}=0$ and $x_{1}(y_{\pi_{m}})=\pi_{m}<x_{1}(0)=m$, we conclude that

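$$a=x_{1}^{\prime}(0^{-})\geq\frac{x_{1}(y_{\pi_{m}})-x_{1}(0)}{y_{\pi_{m}}-0}=\frac{\pi_{m}-m}{y_{\pi_{m}}}>0\tag{4.6}$$

by [Roc, theorem 23.1], that is, $l$ has positive slope. For the case when $\pi_{m}=m$ we have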

$$a=x_{1}^{\prime}(0^{-})=\lim_{\epsilon\to 0^{-}}\frac{x_{1}(\epsilon)-m}{\epsilon}\leq 0,\tag{4.7}$$

since $\epsilon<0$ and $m\leq x_{1}(\epsilon)$ . Thus $l$ has nonpositive slope.

Analogously, it can be shown that when $M<\pi_{M}$ , $L$ has negative slope and when $M=\pi_{M}$ , $L$ has nonnegative slope.
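The monotonicity of the chord slopes used above can be observed on a concrete convex function. The sketch below is an illustration with a hypothetical choice $x_{1}(y)=\tfrac{1}{2}y+y^{2}$, for which the left derivative at $0$ is $\tfrac{1}{2}$; it is not data from the paper.

```python
def chord_slope(f, y, y1):
    """Slope (f(y) - f(y1)) / (y - y1) of the chord between heights y and y1."""
    return (f(y) - f(y1)) / (y - y1)

def x1(y):
    # Hypothetical convex boundary function (an assumption for illustration).
    return 0.5 * y + y * y

# Heights approaching the reference point y1 = 0 from the left.
heights = [-k / 10 for k in range(10, 0, -1)]  # -1.0, -0.9, ..., -0.1
slopes = [chord_slope(x1, y, 0.0) for y in heights]

# For a convex function the chord slopes do not decrease as y increases,
# and each of them is bounded above by the left derivative at y1.
nondecreasing = all(s <= t for s, t in zip(slopes, slopes[1:]))
bounded_by_left_derivative = all(s <= 0.5 for s in slopes)
```

Here the chord slopes equal $\tfrac{1}{2}+y$ up to rounding, so they increase towards $\tfrac{1}{2}$ as $y\to 0^{-}$.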

Corollary 4.4.

The numerical range is convex if and only if $\pi_{m}=m$ and $\pi_{M}=M$ .

Proof.

We begin by proving that if $\pi_{m}\neq m$ or $\pi_{M}\neq M$, then the numerical range is non-convex. Suppose $\pi_{m}<m$ (the case $M<\pi_{M}$ is analogous). Let $x=l(y)=m+ay$ be the left tangent line to $x_{1}(\cdot)$ at $0$ as in (4.5).

For $y>0$ , we have, by (4.6), $x=l(y)=m+ay\geq m$ . Notice that $(\pi_{m},-y_{\pi_{m}})\in B^{+}$ . Therefore, $l(-y_{\pi_{m}})=m+(-y_{\pi_{m}})a\geq l(0)=m>\pi_{m}$ . Hence, we have found $(\pi_{m},-y_{\pi_{m}})\in B^{+}$ such that $l(-y_{\pi_{m}})>\pi_{m}$ . From theorem 4.3 we have $(\pi_{m},-y_{\pi_{m}})\notin\mathscr{C}(B)$ and so $B$ is not convex since $\mathscr{C}(B)\neq B$ . By theorem 3.1 we conclude that $W$ is not convex.

Now we prove that if $\pi_{m}=m$ and $\pi_{M}=M$ then the numerical range is convex. Recall that $l(y)=ay+m$ , with $a\leq 0$ , see (4.7). For $y>0$ , we have $l(y)\leq m$ . For every $(x,y)\in B$ , we have $x\geq m\geq l(y)$ .

Analogously, we can show that $x\leq M\leq L(y)$ . From theorem 4.3 we have that $(x,y)\in\mathscr{C}(B)$ . Since $(x,y)$ is arbitrary, we have that $\mathscr{C}(B)=B$ is convex and from theorem 3.1, $W$ is convex. ∎
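The criterion of corollary 4.4 compares the extreme real parts over the whole bild with the extreme real parts on the real axis. The sketch below estimates these four numbers from sampled boundary functions of the lower bild; both pairs of boundary functions are hypothetical, and the sampling and the tolerance `tol` are assumptions, so the test is a heuristic check and not a proof.

```python
import math

def extremes(x1, x2, y_min, samples=1000):
    """Return sampled (pi_m, m, M, pi_M) for a lower bild between x1 and x2."""
    ys = [y_min * k / samples for k in range(samples + 1)]
    pi_m = min(x1(y) for y in ys)  # smallest real part over the lower bild
    pi_M = max(x2(y) for y in ys)  # largest real part over the lower bild
    return pi_m, x1(0.0), x2(0.0), pi_M

def looks_convex(x1, x2, y_min, tol=1e-12):
    """Heuristic version of the criterion: pi_m = m and pi_M = M."""
    pi_m, m, M, pi_M = extremes(x1, x2, y_min)
    return abs(pi_m - m) <= tol and abs(pi_M - M) <= tol

# Hypothetical example 1: lower half of the unit disc (pi_m = m, pi_M = M).
def disc_x1(y):
    return -math.sqrt(1.0 - y * y)

def disc_x2(y):
    return math.sqrt(1.0 - y * y)

# Hypothetical example 2: boundaries that bulge beyond [m, M] = [0, 1].
def bulge_x1(y):
    return 0.5 * y + y * y

def bulge_x2(y):
    return 1.0 - 0.5 * y - y * y
```

By the symmetry of the bild with respect to the real axis, sampling the lower bild alone is enough to estimate $\pi_{m}$ and $\pi_{M}$.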

An interesting case, where the center is a kite, is when $\pi_{m}<m$ and $M<\pi_{M}$ . The next corollary proves this result.

Corollary 4.5.

Let $\pi_{m}<m\leq M<\pi_{M}$ . Suppose there is $\boldsymbol{\tilde{\omega}}\in\mathbb{C}^{+}$ such that $l\cap L=\{\boldsymbol{\tilde{\omega}}\}$ . Then,

[TABLE]

Proof.

When $\pi_{m}<m$ , as we have noticed in (4.6), $l$ has positive slope. Similarly, we can show that $L$ has negative slope. Since $l$ passes through $(m,0)$ and $L$ through $(M,0)$ , $l$ and $L$ must cross at a point in $\mathbb{C}^{+}$ . Let this point be $\boldsymbol{\tilde{\omega}}$ . The result follows from theorem 4.3. ∎
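The crossing point of the two tangent lines is obtained by solving $m+ay=M+by$. The sketch below does this in exact rational arithmetic; the function name and the sample values are hypothetical and are not the data of any matrix considered in the paper.

```python
from fractions import Fraction

def tangent_intersection(m, M, a, b):
    """Solve m + a*y = M + b*y and return the common point (x, y) exactly."""
    m, M, a, b = (Fraction(v) for v in (m, M, a, b))
    if a == b:
        raise ValueError("the lines are parallel or coincide")
    y = (M - m) / (a - b)
    return m + a * y, y

# Hypothetical values (assumptions): m = 0, M = 2, a = 1, b = -1.
apex = tangent_intersection(0, 2, 1, -1)
```

When $a>0>b$ and $m\leq M$, the computed height $y=(M-m)/(a-b)$ is nonnegative, in agreement with the crossing point lying in $\mathbb{C}^{+}$.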

The following example illustrates a case where the center is a kite.

Example.

Following [ST, page 318], let $A=\left[\begin{array}[]{cc}k_{1}i&\alpha\\ -\alpha&1+k_{2}i\end{array}\right],$ with $\alpha,k_{1},k_{2}\in\mathbb{R}^{+}$ and $\alpha^{2}>k_{1}k_{2}$ . In this case, the boundary of the lower bild $B^{-}$ consists of an ellipse $\mathscr{E}$ and the segment $[m,M]\times\{0\}$ , where $(m,0)$ and $(M,0)$ are the points where $\mathscr{E}$ intersects the real axis (the notation in [ST] is $m=T_{1}$ and $M=T_{2}$ ). Our aim is to describe the center of the bild of $A$ .

From [ST, lemma 6.4], case $5$ , the ellipse $\mathscr{E}$ contains the points $(0,-k_{1})$ , $(1,-k_{2})$ , $(m,0)$ and $(M,0)$ , where

[TABLE]

Moreover, we know that the vertical lines $x=0$ and $x=1$ are tangent to the ellipse at $(0,-k_{1})$ and $(1,-k_{2})$ , respectively. These data fully characterize the ellipse $\mathscr{E}$ . Therefore, if we substitute those points in the general equation

$$Ax^{2}+Bxy+Cy^{2}+Dx+Ey+F=0$$

we obtain a homogeneous system of six linear equations with six unknowns. From formulas (4.8) one concludes that the linear system’s matrix has rank $5$ . Solving the linear system leads to the following characterization of $\mathscr{E}$ :

[TABLE]

Taking the derivative $\dfrac{d}{dy}$ in (4.9) with $x=x(y)$ (recall that the left derivatives $x_{1}^{\prime}(0^{-})$ and $x_{2}^{\prime}(0^{-})$ exist), we get

[TABLE]

It is now possible, albeit through a tedious computation, to define the lines $l$ and $L$ as in theorem 4.3 and characterize $\mathscr{C}(B)$.

Let us consider a more specific example. Take $A=\left[\begin{array}[]{cc}\frac{1}{8}i&\frac{1}{4}\\ -\frac{1}{4}&1+\frac{1}{8}i\end{array}\right],$ i.e. $\alpha=\dfrac{1}{4}$ and $k_{1}=k_{2}=\dfrac{1}{8}$ . Then, the ellipse $\mathscr{E}$ becomes

[TABLE]

or, in the reduced form,

[TABLE]

We have:

[TABLE]

The lines $l$ and $L$ are given by

[TABLE]

They intersect at $\Big{(}\dfrac{1}{2},\dfrac{3}{8}\Big{)}$ , a point on the boundary of the bild of $A$ .

We conclude that the center of the bild of $A=\left[\begin{array}[]{cc}\frac{1}{8}i&\frac{1}{4}\\ -\frac{1}{4}&1+\frac{1}{8}i\end{array}\right]$ is:

[TABLE]

Bibliography

[AY1] Y. Au-Yeung, On the convexity of the numerical range in quaternionic Hilbert space, Linear and Multilinear Algebra, 16 (1984), 93–100.
[AY2] Y. Au-Yeung, A short proof of a theorem on the numerical range of a normal quaternionic matrix, Linear and Multilinear Algebra, 39:3 (1995), 279–284.
[AYS] Y. Au-Yeung, L. Siu, Quaternionic numerical range and real subspaces, Linear and Multilinear Algebra, 45 (1999), 317–327.
[CT] W. Cheung, N.-K. Tsing, The $C$-numerical range of matrices is star-shaped, Linear and Multilinear Algebra, 41:3 (1996), 245–250.
[GR] K. Gustafson, D. Rao, Numerical Range, Springer-Verlag, New York, 1997.
[J] J. Jamison, Numerical range and numerical radius in quaternionic Hilbert space, Doctoral Dissertation, Univ. of Missouri, 1972.
[Ki] R. Kippenhahn, On the numerical range of a matrix, translated from the German by Paul F. Zachlin and Michiel E. Hochstenbach, Linear and Multilinear Algebra, 56:1-2 (2008), 185–225.
[K] P. Kumar, A note on convexity of sections of quaternionic numerical range, Linear Algebra and its Applications, 572 (2019), 92–116.
