Efficient Reduction of Compressed Unitary plus Low-rank Matrices to Hessenberg form

Roberto Bevilacqua; Gianna M. Del Corso; Luca Gemignani

arXiv:1901.08411·math.NA·August 30, 2019

TL;DR

This paper introduces efficient numerical methods for reducing a compressed unitary plus low-rank matrix to Hessenberg form, enabling faster eigenvalue computations by exploiting structured decompositions and bulge chasing techniques.

Contribution

It develops a novel structured decomposition called LFR for such matrices and provides a fast reduction algorithm with $O(n^2 k)$ complexity, improving eigenvalue computation efficiency.

Findings

1. Reduction cost is $O(n^2 k)$ arithmetic operations.

2. LFR decomposition enables efficient Hessenberg reduction.

3. Eigenvalues can be computed using a fast QR algorithm after reduction.

Tables

Table 1: Backward errors for random matrices with $k=2$

n	$\|A\|_{2}$	$\epsilon_{P}$	$\epsilon_{B}$	$\epsilon_{H}$
32	8.2e+01	2.2e-17	3.9e-17	4.3e-19
64	1.4e+02	1.5e-17	5.2e-17	5.1e-19
128	2.7e+02	7.7e-18	6.0e-17	2.0e-19
256	5.2e+02	5.5e-18	1.3e-16	1.4e-19
512	1.0e+03	3.2e-18	2.2e-16	1.4e-19

Table 2: Backward errors for random matrices of large norm with $k=2$

n	$\|A\|_{2}$	$\epsilon_{P}$	$\epsilon_{B}$	$\epsilon_{H}$
32	7.6e+04	1.2e-17	4.9e-17	7.0e-22
64	6.0e+05	1.3e-17	5.7e-17	2.1e-22
128	4.5e+06	5.5e-18	7.6e-17	6.6e-24
256	3.6e+07	6.6e-18	1.3e-16	1.5e-24
512	2.7e+08	2.3e-18	2.2e-16	2.6e-25

Table 3: Backward errors for random matrices with $k=4$

n	$\|A\|_{2}$	$\epsilon_{P}$	$\epsilon_{B}$	$\epsilon_{H}$
64	1.5e+02	7.4e-18	4.4e-17	2.3e-18
128	2.9e+02	3.2e-18	5.6e-17	1.2e-18
256	5.5e+02	2.5e-18	9.6e-17	4.1e-19
512	1.1e+03	1.8e-18	1.6e-16	5.0e-19

Table 4: Backward errors for random matrices of large norm with $k=4$

n	$\|A\|_{2}$	$\epsilon_{P}$	$\epsilon_{B}$	$\epsilon_{H}$
64	6.2e+05	6.6e-18	5.2e-17	5.3e-18
128	4.9e+06	4.5e-18	6.8e-17	1.8e-18
256	3.8e+07	2.2e-18	9.2e-17	5.5e-19
512	2.9e+08	2.3e-18	1.6e-16	8.0e-19

Equations

G=\begin{bmatrix}\times&\times&L_{3}\\ R_{1}&\times&\times\\ &\times&\times&\times&L_{5}\\ &R_{2}&\times&\times&\times\\ &&&\times&\times\\ &&&R_{4}&\times&\ddots\\ &&&&\ddots&\ddots\end{bmatrix}

G=\begin{bmatrix}\times&L_{2}\\ \times&\times&\times&L_{4}\\ R_{1}&\times&\times&\times\\ &&\times&\times&\times&L_{6}\\ &&R_{3}&\times&\times&\times&\ddots\\ &&&&&\ddots&\ddots\end{bmatrix}

rank (U (\alpha, \beta)) = rank (U (J \setminus \alpha, J \setminus \beta)) + |\alpha| + |\beta| - n

rank (U (1 : h, h + 1 : n)) = rank (U (h + 1 : n, 1 : h)), \mbox{for all } h = 1, \dots, n - 1.

0 = rank (G (1 : 2 p k, (2 p + 1) k + 1 : n)) = rank (G (2 p k + 1 : n, 1 : (2 p + 1) k)) - k

rank (G (2 p k + 1 : 2 (p + 1) k, (2 p - 1) k + 1 : (2 p + 1) k)) = k .

G=\begin{bmatrix}\times&\times&L_{3}\\ R_{1}&\times&\times\\ &\times&\times&\times&L_{5}\\ &R_{2}&\times&\times&\times\\ &&&\times&\times\\ &&&R_{4}&\times&\ddots\\ &&&&\ddots&\ddots\end{bmatrix}

G_{1} = diag (G_{1, 1}, \dots, G_{1, s}), G_{2} = diag (I_{k}, G_{2, 2}, \dots, G_{2, s + 1})

\begin{bmatrix}Q_{1,1}&Q_{1,2}\\ R_{2,1}&Q_{2,2}\\ &&I\end{bmatrix}^{H}\begin{bmatrix}\times&\times&L_{3}\\ R_{1}&\times&\times\\ &\times&\times&\times&L_{5}\\ &R_{2}&\times&\times&\times\\ &&&\times&\times\\ &&&R_{4}&\times&\ddots\\ &&&&\ddots&\ddots\end{bmatrix}=\begin{bmatrix}\tilde{\times}\\ &\tilde{\times}&\tilde{\times}\\ &\times&\times&\times&L_{5}\\ &R_{2}&\times&\times&\times\\ &&&\times&\times\\ &&&R_{4}&\times&\ddots\\ &&&&\ddots&\ddots\end{bmatrix}

Q_{1, 1} R_{2, 1} Q_{1, 2} Q_{2, 2} Q_{3, 3} R_{4, 3} Q_{3, 4} Q_{4, 4} I^{H} \times R_{1} \times \times \times R_{2} L_{3} \times \times \times \times \times \times R_{4} L_{5} \times \times \times ⋱ ⋱ ⋱ \par = \tilde{\times} \tilde{\times} \tilde{\times} \tilde{\times} \tilde{\times} \tilde{\times} \times R_{4} \tilde{\times} \times \times ⋱ ⋱ ⋱ \par

\begin{array}[]{ll}P(D+UV^{H})P^{H}=G^{H}+(e_{1}\otimes I_{k})U_{1}(PV)^{H}=\\ G_{2}^{H}(I+(e_{1}\otimes I_{k})Z^{H})G_{1}^{H},\end{array}

\widehat{U}=I_{m}-\left[\begin{array}[]{cc}Q\\ -I_{k}\end{array}\right]\left[\begin{array}[]{cc}Q\\ -I_{k}\end{array}\right]^{H}.

\widehat{A}=\left[\begin{array}[]{cc}L\\ &I_{k}\end{array}\right]\left(\widehat{U}+\left(\left[\begin{array}[]{cc}G^{H}\\ 0\end{array}\right]+\left[\begin{array}[]{cc}Q\\ -I_{k}\end{array}\right]\right)\left[\begin{array}[]{cc}Q\\ 0\end{array}\right]^{H}\right)\left[\begin{array}[]{cc}R\\ &I_{k}\end{array}\right]

\widehat{A}=\left[\begin{array}[]{cc}A&B\\ 0&0\end{array}\right],\quad B\in\mathbb{C}^{n\times k}.

\left[\begin{array}[]{cc}Q\\ -I_{k}\end{array}\right]^{H}\left[\begin{array}[]{cc}Q\\ -I_{k}\end{array}\right]=2I_{k}.

\widehat{U}+\left(\left[\begin{array}[]{cc}G^{H}\\ 0\end{array}\right]+\left[\begin{array}[]{cc}Q\\ -I_{k}\end{array}\right]\right)\left[\begin{array}[]{cc}Q\\ 0\end{array}\right]^{H}=\left[\begin{array}[]{cc}I_{n}&Q\\ &0_{k}\end{array}\right]+\left[\begin{array}[]{cc}I_{k}\\ 0\end{array}\right]\left[\begin{array}[]{cc}Z\\ 0\end{array}\right]^{H}.

\widehat{A}=\left[\begin{array}[]{cc}A&\ast\\ 0_{k,n}&0_{k,k}\end{array}\right].

C (h : m, 1 : h + k - 2)

P S^{H} = C (:, 1 : k) M^{- 1} C (n + 1 : m, :) = L (:, 1 : k) M^{- 1} C (n + 1 : m, :),

(C (n + 1 : m, :) + M Z^{H}) R = 0,

P S^{H} = L (:, 1 : k) M^{- 1} C (n + 1 : m, :) = - L (:, 1 : k) Z^{H} = - L [I_{k}, 0]^{T} Z^{H} .

X=\left[\begin{array}[]{cc}Q\\ -I_{k}\end{array}\right],\quad Y=\left[\begin{array}[]{cc}G^{H}\\ 0\end{array}\right]+X,\quad\quad W=\left[\begin{array}[]{cc}Q\\ 0\end{array}\right],

\widehat{A}=\left[\begin{array}[]{cc}L\\ &I_{k}\end{array}\right]\left(\widehat{U}+YW^{H}\right)\left[\begin{array}[]{cc}R\\ &I_{k}\end{array}\right],\quad\widehat{U}=I_{m}-XX^{H}.

L_{0}:=\left[\begin{array}[]{cc}L\\ &I_{k}\end{array}\right],\quad R_{0}:=\left[\begin{array}[]{cc}R\\ &I_{k}\end{array}\right],\quad X_{0}:=X,\quad Y_{0}:=Y,\quad W_{0}:=W.

\tilde{H} = \tilde{G}_{n - k - 1} \dots \tilde{G}_{2} \tilde{G}_{1} = [\hat{H} I_{k}] .

A_{0} = (L_{0} Q_{0}) \cdot (Q_{0}^{H} U + T_{0} W_{0}^{H}) \cdot R_{0} .

A_{1} = Q_{0} \cdot (Q_{0}^{H} U R_{0} + T_{0} W_{0}^{H} R_{0}) L_{0} .

U_{1} = Q_{0}^{H} U Q_{0} Q_{0}^{H} R_{0} = (I_{m} - \hat{X} \hat{X}^{H}) Q_{0}^{H} R_{0},

\widehat{U}_{1}P=\left[\begin{array}[]{c|c}I_{k}&\\ \hline\cr&U_{1}(k+1:m,:)P(:,k+1:m)\end{array}\right]=\left[\begin{array}[]{c|c}I_{k}&\\ \hline\cr&\hat{Q}\end{array}\right]

Full text

Efficient Reduction of Compressed Unitary plus Low-rank Matrices to Hessenberg form

The research of the last two authors was partially supported by the GNCS project “Tecniche innovative per problemi di algebra lineare” and by the project sponsored by the University of Pisa under the grant PRA-2017-05.

R. Bevilacqua, Dipartimento di Informatica, Università di Pisa, Pisa, Italy

G.M. Del Corso, Dipartimento di Informatica, Università di Pisa, Pisa, Italy

L. Gemignani, Dipartimento di Informatica, Università di Pisa, Pisa, Italy

Abstract

We present fast numerical methods for computing the Hessenberg reduction of a unitary plus low-rank matrix $A=G+UV^{H}$, where $G\in\mathbb{C}^{n\times n}$ is a unitary matrix represented in some compressed format using $O(nk)$ parameters and $U$ and $V$ are $n\times k$ matrices with $k<n$. At the core of these methods is a certain structured decomposition, referred to as an LFR decomposition, of $A$ as a product of three possibly perturbed unitary $k$-Hessenberg matrices of size $n$. It is shown that in most interesting cases an initial LFR decomposition of $A$ can be computed very cheaply. Then we prove structural properties of LFR decompositions by giving conditions under which the LFR decomposition of $A$ implies its Hessenberg shape. Finally, we describe a bulge chasing scheme for converting the initial LFR decomposition of $A$ into the LFR decomposition of a Hessenberg matrix by means of unitary transformations. The reduction can be performed at the overall computational cost of $O(n^{2}k)$ arithmetic operations using $O(nk)$ storage. The computed LFR decomposition of the Hessenberg reduction of $A$ can be processed by the fast QR algorithm presented in [8] in order to compute the eigenvalues of $A$ at the same cost.

Keywords: Hessenberg reduction, Rank-structured matrices, QR Method, Bulge chasing, CMV matrix, Complexity.

AMS subject classification: 65F15

1 Introduction

Eigenvalue computation for small-rank modifications of unitary matrices represented in some compressed format is a classical topic in structured numerical linear algebra. Matrices of the form $A=D+UV^{H}$, where $D$ is a unitary $n\times n$ block diagonal matrix and $U,V\in\mathbb{C}^{n\times k}$, $k<n$, arise commonly in the numerical treatment of structured (generalized) eigenvalue problems [1, 2]. In particular, any unitary plus low-rank matrix can be reduced to this form by a unitary similarity transformation; additionally, matrices of this form can be generated directly by linearization techniques based on interpolation schemes applied to the solution of nonlinear eigenvalue problems [6, 18, 9, 7]. The class of unitary block upper Hessenberg matrices perturbed in the first block row or in the last block column includes block companion linearizations of matrix polynomials. These matrices are also related to computational problems involving orthogonal matrix polynomials on the unit circle [22, 21]. Constructing the sequence of orthogonal polynomials with respect to a different basis modifies the compressed format of the unitary part by replacing the block Hessenberg shape with the block CMV shape [11, 19, 20]. Semi-infinite block upper Hessenberg and CMV unitary matrices are commonly used to represent unitary operators on a separable Hilbert space [3, 12]. Finite truncations of these matrices are unitary block Hessenberg/CMV matrices modified in the last row or column.
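As a concrete instance of the companion case mentioned above, the following sketch (an illustration added here, not code from the paper) takes $k=1$ and splits the companion matrix of a monic scalar polynomial into a unitary cyclic shift plus a rank-one correction confined to the last column. The function name `companion_as_unitary_plus_rank_one` and the coefficient ordering are assumptions made for the example.

```python
import numpy as np

def companion_as_unitary_plus_rank_one(coeffs):
    """Split a companion matrix into unitary + rank one (hypothetical helper).

    coeffs = [a_0, ..., a_{n-1}] for p(z) = z^n + a_{n-1} z^{n-1} + ... + a_0.
    Returns (G, u, v) with companion matrix C = G + u v^H.
    """
    a = np.asarray(coeffs, dtype=complex)
    n = a.size
    # Cyclic shift: ones on the subdiagonal and in the top-right corner.
    # It is a permutation matrix, hence unitary, and it is upper Hessenberg.
    G = np.roll(np.eye(n, dtype=complex), 1, axis=0)
    # The last column of C must equal -a, so subtract the corner entry of G.
    u = -a.copy()
    u[0] -= 1.0
    # v = e_n selects the last column, so u v^H only touches that column.
    v = np.zeros(n, dtype=complex)
    v[-1] = 1.0
    return G, u[:, None], v[:, None]

# Usage: p(z) = (z - 1)(z - 2)(z - 3) = z^3 - 6 z^2 + 11 z - 6
G, u, v = companion_as_unitary_plus_rank_one([-6.0, 11.0, -6.0])
A = G + u @ v.conj().T
eigenvalues = np.linalg.eigvals(A)  # roots of p, up to rounding
```

For matrix polynomials the same splitting is applied blockwise, which is where the block size $k$ enters.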

In most numerical methods, Hessenberg reduction by unitary similarity transformations is the first step towards eigenvalue computation. Recently, a fast reduction algorithm specifically tailored for block companion matrices was presented in [5], whereas some efficient algorithms for dealing with block unitary diagonal plus small-rank matrices were developed in [17]. In particular, these latter algorithms are two-phase. In the first phase, the matrix $A$ is reduced to a banded form $A_{1}$, employing a block CMV-like format to represent the unitary part. The second phase amounts to incrementally annihilating the lower subdiagonals of $A_{1}$ by means of Givens rotations, which are gathered in order to construct a data-sparse compressed representation of the final Hessenberg matrix $A_{2}$. The representation involves $O(nk)$ data storage consisting of $O(n)$ vectors of length $k$ and $O(nk)$ Givens rotations. This compression is usually known as a Givens–Vector representation [24, 25], and it can also be explicitly resolved to produce a generators-based representation [14, 15]. However, a major weakness of this approach is that neither of these compressed formats is suited to the design of fast specialized eigensolvers for unitary plus low-rank matrices using only $O(n^{2}k)$ ops.
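To make the role of Givens rotations concrete, here is a minimal dense sketch of Hessenberg reduction by unitary similarity. It is an unstructured $O(n^{3})$ baseline written for illustration, not the two-phase algorithm of [17] nor the method of this paper; the helper names `givens` and `hessenberg_by_givens` are hypothetical.

```python
import numpy as np

def givens(a, b):
    """Return a 2x2 unitary Q such that Q @ [a, b]^T = [r, 0]^T."""
    r = np.hypot(abs(a), abs(b))
    if r == 0.0:
        return np.eye(2, dtype=complex)
    c, s = a / r, b / r
    return np.array([[np.conj(c), np.conj(s)], [-s, c]])

def hessenberg_by_givens(A):
    """Reduce A to upper Hessenberg form H = Q^H A Q using Givens rotations."""
    H = np.array(A, dtype=complex)
    n = H.shape[0]
    for j in range(n - 2):
        # Annihilate column j from the bottom up to row j + 2.
        for i in range(n - 1, j + 1, -1):
            Q = givens(H[i - 1, j], H[i, j])
            # Rotate rows i-1, i, then apply the inverse rotation to the
            # same pair of columns so the result stays similar to A.
            H[[i - 1, i], :] = Q @ H[[i - 1, i], :]
            H[:, [i - 1, i]] = H[:, [i - 1, i]] @ Q.conj().T
    return H
```

Because the column update only touches columns $i-1$ and $i$, both larger than $j$, zeros created earlier are never destroyed. The structured algorithms discussed in the text aim to obtain the same result without ever forming the dense matrix.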

In this paper we describe a novel $O(n^{2}k)$ backward stable algorithm for computing the Hessenberg reduction of general matrices $A\in\mathbb{C}^{n\times n}$ of the form $A=G+UV^{H}$, where $G$ is unitary block diagonal, unitary block upper Hessenberg, or block CMV with block size $k<n$. In the case where $G$ is unitary block upper Hessenberg or block CMV, we additionally require that $U=\left[I_{k},0,\ldots,0\right]^{T}$. These families include most of the important cases arising in applications.
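The block diagonal case gives a simple picture of what a compressed format with $O(nk)$ parameters means. The sketch below is an illustration under assumptions ($n$ divisible by $k$, helper names `random_unitary` and `apply_A` invented here): it stores $G=D$ as $n/k$ unitary $k\times k$ blocks and applies $A=D+UV^{H}$ to a vector in $O(nk)$ operations without forming the $n\times n$ matrix.

```python
import numpy as np

def random_unitary(k, rng):
    """Random k x k unitary matrix (Q factor of a complex Gaussian matrix)."""
    M = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    Q, _ = np.linalg.qr(M)
    return Q

def apply_A(blocks, U, V, x):
    """Compute (D + U V^H) x with D = diag(blocks), using O(nk) operations."""
    k = blocks[0].shape[0]
    # Block diagonal part: each k x k block acts on its own slice of x.
    y = np.concatenate([B @ x[i * k:(i + 1) * k] for i, B in enumerate(blocks)])
    # Low-rank part: first the k inner products V^H x, then the update by U.
    return y + U @ (V.conj().T @ x)
```

With $n/k$ blocks of $k^{2}$ entries each, the unitary part takes exactly $nk$ numbers, and $U$ and $V$ add another $2nk$.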

This algorithm circumvents the drawback of the method proposed in [17] by introducing a different data-sparse compressed representation of the final Hessenberg matrix which is effectively usable in fast eigenvalue schemes. In particular, the representation is suited for the fast eigensolver for unitary plus low rank matrices developed in [8]. Our derivation is based on three key ingredients or building blocks:

1. A condensed representation of the matrix $A$ (or of a matrix unitarily similar to $A$) which can be specified as $A=L(I+(e_{1}\otimes I_{k})Z^{H})R=LFR$, where $L$ is the product of $k$ unitary lower Hessenberg matrices, $R$ is the product of $k$ unitary upper Hessenberg matrices and the middle factor $F$ is the identity matrix perturbed in the first $k$ rows.

When the matrix $G$ is block upper Hessenberg or block diagonal, we can obtain the $LFR$ representation in a simple way that we clarify in Sections 2.2 and 2.3. When $G$ is a unitary block CMV matrix, we provide a suitable extension of the well-known factorization of CMV matrices as a product of two block diagonal unitary matrices that are both the direct sum of $2\times 2$ or $1\times 1$ unitary blocks (compare with [20] and the references given therein). Specifically, block CMV matrices with blocks of size $k$ are $2k$-banded unitary matrices with a 'staircase-shaped' profile. It is shown that a block CMV matrix with blocks of size $k$ admits a factorization as a product of two unitary block diagonal matrices with $k\times k$ diagonal blocks. It follows that the block CMV matrix can be decomposed as the product of a unitary lower $k$-Hessenberg matrix multiplied by a unitary upper $k$-Hessenberg matrix.

2. An embedding technique which, for a given triple $(L,F,R)$ associated with $A$, makes it possible to construct a larger matrix $\widehat{A}\in\mathbb{C}^{(n+k)\times(n+k)}$ which is still unitary plus rank-$k$ and can be factored as $\widehat{A}=\widehat{L}\cdot\widehat{F}\cdot\widehat{R}$, where $\widehat{L}$ is the product of $k$ unitary lower Hessenberg matrices, $\widehat{R}$ is the product of $k$ unitary upper Hessenberg matrices and the middle factor $\widehat{F}$ is unitary block diagonal plus rank-$k$ with some additional properties.

3. A theoretical result which provides conditions under which a matrix specified in the form $\widehat{A}=\widehat{L}\cdot\widehat{F}\cdot\widehat{R}$ turns out to be Hessenberg.

Combining these ingredients allows the design of a specific bulge-chasing strategy for converting the $LFR$ factored representation of $\widehat{A}$ into the $LFR$ decomposition of an upper Hessenberg matrix $\widetilde{A}$ unitarily similar to $\widehat{A}$. The final representation of $\widetilde{A}$ thus involves $O(nk)$ data storage consisting of $O(k)$ vectors of length $n$ and $O(nk)$ Givens rotations. The reduction to Hessenberg form has the same asymptotic complexity as eigensolvers for unitary plus low rank matrices and, furthermore, this representation can be used directly by the fast eigensolver for unitary plus low rank matrices developed in [8].

The paper is organized as follows. In Section 2 we introduce the $LFR$ representations of unitary plus rank-$k$ matrices by devising fast algorithms for transforming a matrix $A$ into its $LFR$ format, provided that $A$ belongs to some special classes. In Section 3 we investigate the properties of $LFR$ representations of unitary plus rank-$k$ Hessenberg matrices and we describe a suitable technique to embed the matrix $A$ into a larger matrix $\widehat{A}$ while maintaining its structural properties. In Section 4 we present our algorithm, which modifies the $LFR$ representation of $\widehat{A}$ by computing the corresponding $LFR$ representation of a unitarily similar Hessenberg matrix. Finally, numerical experiments are discussed in Section 5, whereas conclusions and future work are presented in Section 6.

2 The $LFR$ Format of Unitary plus Rank- $k$ Matrices

In this section we introduce a suitable compressed representation of unitary plus rank- $k$ matrices which can be exploited for the design of fast Hessenberg reduction algorithms.

Definition 2.1.

A unitary plus rank- $k$ matrix $A\in\mathbb{C}^{n\times n}$ can be represented in the LFR format if there is a triple $(L,F,R)$ of matrices such that:

1. $A=LFR$;

2. $L\in\mathbb{C}^{n\times n}$ is the product of $k$ unitary lower Hessenberg matrices;

3. $R\in\mathbb{C}^{n\times n}$ is the product of $k$ unitary upper Hessenberg matrices;

4. $F=Q+[I_{k},0]^{T}Z^{H}\in\mathbb{C}^{n\times n}$ is a unitary plus rank-$k$ matrix, where $Q$ is a block diagonal unitary matrix of the form $Q=\left[\begin{array}{c|c}I_{k}&\\ \hline&\hat{Q}\end{array}\right]$, with $\hat{Q}$ unitary Hessenberg and $Z\in\mathbb{C}^{n\times k}$.

In the sequel of this section we present some fast algorithms for computing the $LFR$ format of a unitary plus rank- $k$ matrix $A\in\mathbb{C}^{n\times n}$ specified as follows:

• $A=G+[I_{k},0]^{T}Z^{H}$, $Z\in\mathbb{C}^{n\times k}$, and $G$ is unitary block CMV with block size $k<n$;

• $A=H+[I_{k},0]^{T}Z^{H}$, $Z\in\mathbb{C}^{n\times k}$, and $H$ is unitary block upper Hessenberg with block size $k<n$;

• $A=D+UV^{H}$, $U,V\in\mathbb{C}^{n\times k}$, and $D$ is unitary block diagonal with block size $k<n$.

These three cases cover the most interesting structures of low-rank perturbations of unitary matrices. In the general case, where the spectral factorization of the unitary part is not known or the unitary matrix cannot be represented in terms of a linear number of parameters, we cannot expect to recover the eigenvalues, even of the unitary part alone, in $o(n^{3})$ operations.

In the following subsections we examine these three cases.

2.1 Small Rank Modifications of Unitary Block CMV Matrices

A block analogue of the CMV form of unitary matrices has been introduced in [17, 3].

Definition 2.2 (CMV shape).

A unitary matrix $G\in\mathbb{C}^{n\times n}$ is said to be CMV structured with block size $k$ if there exist $k\times k$ non-singular matrices $R_{i}$ and $L_{i}$ , respectively upper and lower triangular, such that

[TABLE]

or

[TABLE]

*where the symbol $\times$ has been used to identify (possibly) nonzero $k\times k$ blocks.*

Block CMV matrices are associated with matrix orthogonal polynomials on the unit circle and the structure of the matrix depends on the choice of the starting basis of the set of matrix polynomials to be orthogonalized. In particular, $G$ fits the block structure shown in Definition 2.2 if $\left\{I_{k},zI_{k},z^{-1}I_{k},\ldots\right\}$ or $\left\{I_{k},z^{-1}I_{k},zI_{k},\ldots\right\}$ are considered. In what follows for the sake of simplicity we always assume that $G$ satisfies the block structure (1). Furthermore, in order to simplify the notation we often assume that $n$ is a multiple of $2k$ , so the above structures fit “exactly” in the matrix. However, this is not restrictive and the theory presented here continues to hold in greater generality. In practice, one can deal with the more general case by allowing the blocks in the bottom-right corner of the matrix to be smaller.

Notice that a matrix in CMV form with blocks of size $k$ is, in particular, $2k$ -banded. The CMV structure with blocks of size $1$ has been proposed as a generalization of what the tridiagonal structure is for Hermitian matrices in [11] and [19]. A further analogy between the scalar and the block case is derived from the Nullity Theorem [16] that is here applied to unitary matrices.

Lemma 2.3 (Nullity Theorem).

Let $U$ be a unitary matrix of size $n$ . Then

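$$\operatorname{rank}(U(\alpha,\beta))=\operatorname{rank}(U(J\backslash\alpha,J\backslash\beta))+|\alpha|+|\beta|-n,$$

where $J=\{1,2,\ldots,n\}$ and $\alpha$ and $\beta$ are subsets of $J$. If $\alpha=\{1,\ldots,h\}$ and $\beta=J\backslash\alpha$ we have

$$\operatorname{rank}(U(1:h,h+1:n))=\operatorname{rank}(U(h+1:n,1:h)).$$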

From Lemma 2.3 applied to a block CMV structured matrix $G$ of block size $k$ we find that for $p>0$ :

[TABLE]

which gives

[TABLE]

Pictorially we are observing rank constraints on the following blocks

[TABLE]

and by similar arguments on the corresponding blocks in the upper triangular portion.

In the scalar case with $k=1$ these conditions make it possible to find a factorization of the CMV matrix as product of two block diagonal matrices usually referred to as the classical Schur parametrization [10]. Similarly, here we introduce a block counterpart of the Schur parametrization which gives a useful tool to encompass the structural properties of block CMV representations.
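The scalar case can be illustrated numerically. The following is a minimal sketch, assuming one common convention for the $2\times 2$ Schur blocks and for the placement of the blocks in the two factors; the helper names (`schur_block`, `block_diag`, `cmv_from_schur`) are hypothetical and not taken from the paper. It builds two block diagonal unitary matrices whose blocks are offset by one position and checks that their product is unitary and $2k$-banded with $k=1$.

```python
import math

def schur_block(gamma):
    """2x2 unitary block built from a parameter gamma with |gamma| < 1 (assumed convention)."""
    sigma = math.sqrt(1.0 - abs(gamma) ** 2)
    return [[gamma.conjugate(), sigma], [sigma, -gamma]]

def block_diag(blocks, n):
    """Assemble an n x n block diagonal matrix from a list of square blocks."""
    M = [[0j] * n for _ in range(n)]
    pos = 0
    for blk in blocks:
        for i, row in enumerate(blk):
            for j, val in enumerate(row):
                M[pos + i][pos + j] = val
        pos += len(blk)
    return M

def matmul(A, B):
    """Plain dense matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

def conj_transpose(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def cmv_from_schur(gammas):
    """Return (G1, G2, G) with G = G1 * G2; requires an odd number of parameters."""
    n = len(gammas) + 1
    blocks = [schur_block(g) for g in gammas]
    one = [[1 + 0j]]
    G1 = block_diag(blocks[0::2], n)                 # 2x2 blocks on rows (1,2), (3,4), ...
    G2 = block_diag([one] + blocks[1::2] + [one], n)  # same blocks shifted by one row
    return G1, G2, matmul(G1, G2)

G1, G2, G = cmv_from_schur([0.5 + 0.2j, -0.3j, 0.1 + 0.7j, 0.4 + 0j, -0.6 + 0.1j])
```

With five parameters the product `G` is a $6\times 6$ unitary matrix whose entries vanish outside the band $|i-j|\leq 2$, which is the pentadiagonal profile of a scalar CMV matrix.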

Lemma 2.4 (CMV factorization).

Let $G$ be a unitary CMV structured matrix with blocks of size $k$ as defined in Definition 2.2. Then $G$ can be factored in two block diagonal unitary matrices $G=G_{1}G_{2}$ of the form:

[TABLE]

*such that $G_{2,s+1}$ has $k$ rows and columns and all the other blocks $G_{i,j}$ have $2k$ rows and columns and bandwidth $k$, with both $G_{i,j}(k+1:2k,1:k)$ and $G_{i,j}(1:k,k+1:2k)$ triangular matrices of full rank. Moreover, each matrix $G$ admitting such a factored form is in turn CMV.*

Proof 2.5.

The proof of this result is constructive, and can be obtained by performing a block QR decomposition. We notice that if we compute a QR decomposition of the top-left $2k\times k$ block of $G$ we have

[TABLE]

where $\tilde{\times}$ identifies the blocks that have been altered by the transformation and the block in position $(1,1)$ can be assumed to be the identity matrix. Notice that in the first row the blocks in the second and third columns have to be zero due to $G$ being unitary, and that the $R_{2,1}$ block is nonsingular upper triangular since it inherits the properties of $R_{1}$ .

We can continue this process by computing the QR factorization of $\left[\begin{smallmatrix}\times\\ R_{2}\end{smallmatrix}\right]$. Notice that, by the Nullity Theorem (Lemma 2.3), the block identified by $\left[\begin{smallmatrix}\times&\times\\ R_{2}&\times\\ \end{smallmatrix}\right]$ in the picture has rank at most $k$. The same holds for all the other blocks of the same kind. In particular, computing the QR factorization of the first $k$ columns and left-multiplying by $Q^{H}$ also sets to zero the block on the right of $R_{2}$. We then get the following factorization:

[TABLE]

where we notice that, as before, the block $R_{4,3}$ is nonsingular upper triangular and that some blocks in the upper part have been set to zero thanks to unitarity. The process can then be continued until the end of the matrix, providing a factorization of $G$ as a product of two unitary block diagonal matrices, that is, $G=\widehat{G}_{1}\widehat{G}_{2}$. This factorization can be further simplified by means of a block diagonal scaling $G=(\widehat{G}_{1}D)(D^{H}\widehat{G}_{2})=G_{1}G_{2}$ with $D=\operatorname{diag}(D_{1},\ldots,D_{2s})$, $D_{2j-1}=I_{k}$ and $D_{2j}$ $k\times k$ unitary matrices determined so that the blocks $G_{i,j}$ are of bandwidth $k$, that is, the outermost blocks in $G_{1}$ and $G_{2}$ are triangular. For the sake of illustration consider $j=1$ and let $Q_{1,2}^{H}=QR$ be a QR decomposition of $Q_{1,2}^{H}$. By setting $D_{2}=Q$ we obtain that $Q_{1,2}D_{2}=R^{H}$ and, moreover, from $L_{3}=Q_{1,2}D_{2}(G_{2})_{2,3}=R^{H}(G_{2})_{2,3}$ it follows that the block of $G_{2}$ in position $(2,3)$ also exhibits a lower triangular structure. The construction of the remaining blocks $D_{2j}$, $j>1$, proceeds in a similar way.

Pictorially, the above result gives the following structure of $G_{1}$ and $G_{2}$ :

[TABLE]

Now, let us assume that a matrix $A\in\mathbb{C}^{n\times n}$ is such that $A=G^{T}+[I_{k},0]^{T}Z^{H}$, $Z\in\mathbb{C}^{n\times k}$, and $G$ is unitary block CMV with block size $k<n$. By replacing $G$ with its block diagonal factorization we obtain that $A=G_{2}^{T}(I_{n}+[I_{k},0]^{T}Z^{H}\bar{G}_{1})G_{1}^{T}$. Since the left-hand and right-hand factors $G_{2}^{T}$ and $G_{1}^{T}$ are unitary and $k$-banded, it follows that each of them can be factored as the product of $k$ unitary Hessenberg matrices. Hence, we have the following.

Theorem 2.6.

*Let $A\in\mathbb{C}^{n\times n}$ be such that $A=G^{T}+[I_{k},0]^{T}Z^{H}$, $Z\in\mathbb{C}^{n\times k}$, and $G$ is unitary block CMV with the block structure shown in Equation (1). Then $A$ can be represented in the LFR format as $A=G_{2}^{T}(I_{n}+[I_{k},0]^{T}\widehat{Z}^{H})G_{1}^{T}$, where $L=G_{2}^{T}$, $R=G_{1}^{T}$, $G=G_{1}G_{2}$ is the decomposition provided in Lemma 2.4 and $F=I_{n}+[I_{k},0]^{T}\widehat{Z}^{H}$, $\widehat{Z}^{H}=Z^{H}\bar{G}_{1}$.*

The overall cost of computing this condensed LFR representation of the unitary plus rank- $k$ matrix $A$ is $\mathcal{O}(nk^{2})$ flops using $\mathcal{O}(nk)$ memory storage.
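The identity behind this representation can be verified in one line. The following worked equation is a sketch under one assumption, suggested by the proof of Lemma 2.4 (where the block in position $(1,1)$ is normalized to the identity and $D_{1}=I_{k}$): the leading $k\times k$ block of $G_{2}$ is $I_{k}$, so that $G_{2}^{T}E=E$ with the shorthand $E=[I_{k},0]^{T}$, which is introduced here only for brevity.

```latex
\begin{aligned}
G_{2}^{T}\bigl(I_{n}+E\,Z^{H}\bar{G}_{1}\bigr)G_{1}^{T}
  &= G_{2}^{T}G_{1}^{T} + \bigl(G_{2}^{T}E\bigr)\,Z^{H}\,\bigl(\bar{G}_{1}G_{1}^{T}\bigr) \\
  &= (G_{1}G_{2})^{T} + E\,Z^{H}
   \qquad\text{since } \bar{G}_{1}G_{1}^{T}=\overline{G_{1}G_{1}^{H}}=I_{n} \\
  &= G^{T} + [I_{k},0]^{T}Z^{H} = A .
\end{aligned}
```

The only properties used are the unitarity of $G_{1}$ and the fact that $G_{2}^{T}$ leaves the first $k$ coordinate directions unchanged.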

2.2 Small Rank Modifications of Unitary Block Hessenberg Matrices

The class of perturbed unitary block Hessenberg matrices includes the celebrated block companion forms, which are the basic tool in the construction of matrix linearizations of matrix polynomials. To be specific, let $A\in\mathbb{C}^{n\times n}$ be a matrix such that $A=H+[I_{k},0]^{T}Z^{H}$, $Z\in\mathbb{C}^{n\times k}$, and $H$ is unitary block upper Hessenberg with block size $k<n$. A compressed LFR format of a matrix unitarily similar to $A$ can be computed as follows. First of all we can suppose that all the subdiagonal blocks $H_{i+1,i}$, $1\leq i\leq n/k-1$, are upper triangular. If not, we consider the unitary block diagonal matrix $P$ defined by $P=\operatorname{blkdiag}\left[P_{1},P_{2},\ldots,P_{n/k}\right]$, where $P_{i}\in\mathbb{C}^{k\times k}$, $P_{1}=I_{k}$ and $H_{i+1,i}P_{i}=P_{i+1}R_{i}$ is a QR decomposition of the matrix $H_{i+1,i}P_{i}$, $1\leq i\leq n/k-1$. Then the matrix $\widetilde{A}=P^{H}AP$ is such that $\widetilde{A}=\widetilde{H}+[I_{k},0]^{T}\widetilde{Z}^{H}$, where $\widetilde{Z}=P^{H}Z$ and $\widetilde{H}=P^{H}HP$ is unitary block upper Hessenberg with block size $k<n$ and $\widetilde{H}_{i+1,i}=R_{i}$, $1\leq i\leq n/k-1$. Hence, the matrix $\widetilde{H}$ is banded with lower bandwidth $k$ and therefore the factorization $\widetilde{A}=I_{n}(I_{n}+[I_{k},0]^{T}\widehat{Z}^{H})\widetilde{H}$, with $\widehat{Z}^{H}=\widetilde{Z}^{H}\widetilde{H}^{H}$, gives a suitable LFR representation of $\widetilde{A}$. Summing up we have the following.

Theorem 2.7.

*Let $A\in\mathbb{C}^{n\times n}$ be such that $A=H+[I_{k},0]^{T}Z^{H}$, $Z\in\mathbb{C}^{n\times k}$, and $H$ is unitary block upper Hessenberg with block size $k<n$. Then there exists a unitary block diagonal matrix $P=\operatorname{blkdiag}\left[P_{1},P_{2},\ldots,P_{n/k}\right]$, $P_{i}\in\mathbb{C}^{k\times k}$, $P_{1}=I_{k}$, such that $\widetilde{A}=P^{H}AP$ can be represented in the LFR format as $\widetilde{A}=I_{n}(I_{n}+[I_{k},0]^{T}\widehat{Z}^{H})\widetilde{H}$, where $L=I_{n}$, $R=\widetilde{H}=P^{H}HP$ and $F=I_{n}+[I_{k},0]^{T}\widehat{Z}^{H}$, $\widehat{Z}^{H}=Z^{H}P\widetilde{H}^{H}$.*

The overall cost of computing this condensed LFR representation of the unitary plus rank- $k$ matrix $A$ is $\mathcal{O}(nk^{2})$ flops using $\mathcal{O}(nk)$ memory storage.
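As an illustration of the scalar case $k=1$, consider a companion matrix with the polynomial coefficients in its first row. The sketch below assumes this particular companion form and the cyclic shift as the unitary upper Hessenberg part, so that $P=I_{n}$ and $L=I_{n}$; the function names `companion_lfr` and `matmul` are hypothetical. It builds $F=I_{n}+e_{1}\widehat{z}^{H}$ with $\widehat{z}^{H}=z^{H}H^{H}$ and returns the factors so that $C=FH$ can be checked.

```python
def matmul(A, B):
    """Plain dense matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

def companion_lfr(coeffs):
    """coeffs = [a_{n-1}, ..., a_1, a_0] for x^n + a_{n-1} x^{n-1} + ... + a_0.
    Returns (C, F, H) where C is the companion matrix and C = F * H."""
    n = len(coeffs)
    # Companion matrix with the (negated) coefficients in the first row.
    C = [[0.0] * n for _ in range(n)]
    for j in range(n):
        C[0][j] = -coeffs[j]
    for i in range(1, n):
        C[i][i - 1] = 1.0
    # Unitary upper Hessenberg part: the cyclic shift.
    H = [[0.0] * n for _ in range(n)]
    for i in range(1, n):
        H[i][i - 1] = 1.0
    H[0][n - 1] = 1.0
    # Rank-one correction confined to the first row: C = H + e_1 z^H.
    zH = [C[0][j] - H[0][j] for j in range(n)]
    # zhat^H = z^H H^H (H is real here, so H^H is the transpose).
    zhatH = [sum(zH[l] * H[j][l] for l in range(n)) for j in range(n)]
    # Middle factor F = I + e_1 zhat^H: the identity perturbed in its first row.
    F = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n):
        F[0][j] += zhatH[j]
    return C, F, H

# Example: x^4 - 10 x^3 + 35 x^2 - 50 x + 24.
C, F, H = companion_lfr([-10.0, 35.0, -50.0, 24.0])
FH = matmul(F, H)
```

Since `H` is a permutation matrix, forming $\widehat{z}$ only reorders the entries of $z$, which is consistent with the low cost stated above.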

2.3 Small Rank Modifications of Unitary Block Diagonal Matrices

The unitary block diagonal matrix reduces to a unitary diagonal matrix up to a similarity transformation which can be performed within $O(nk^{2})$ operations. Interest in the properties of block CMV matrices was renewed in [17], where a general scheme is proposed to transform a unitary diagonal plus rank-$k$ matrix into a block CMV structured matrix plus a rank-$k$ perturbation located in the first $k$ rows only. More specifically, we have the following [17].

Theorem 2.8.

*Let $D\in\mathbb{C}^{n\times n}$ be a unitary diagonal matrix and $U\in\mathbb{C}^{n\times k}$ of full rank $k$. Then, there exists a unitary matrix $P$ such that $G=PDP^{H}$ is CMV structured with block size $k$ and the block structure shown in Definition 2.2 and $PU=(e_{1}\otimes I_{k})U_{1}$ for some $U_{1}\in\mathbb{C}^{k\times k}$. The matrices $P$, $G$ and $U_{1}$ can be computed with $O(n^{2}k)$ operations.*

By applying Theorem 2.8 to the matrix pair $(D^{H},U)$ we find that there exists a unitary matrix $P$ such that $G=PD^{H}P^{H}$ is CMV structured with block size $k$ and $PU=(e_{1}\otimes I_{k})U_{1}$ . In view of Lemma 2.4 this yields

[TABLE]

where $Z=G_{1}^{H}PVU_{1}^{H}\in\mathbb{C}^{n\times k}$. Since the left-hand and right-hand factors are unitary and $k$-banded, it follows that each of them can be factored as the product of $k$ unitary Hessenberg matrices. In this way we obtain the next result.

Theorem 2.9.

*Let $A\in\mathbb{C}^{n\times n}$ be such that $A=D+UV^{H}$ with $U,V\in\mathbb{C}^{n\times k}$, and $D$ unitary diagonal. Then there exists a unitary matrix $P\in\mathbb{C}^{n\times n}$ such that $G=PD^{H}P^{H}$ has the block CMV structure shown in Definition 2.2 and $PU=(e_{1}\otimes I_{k})U_{1}$ for some $U_{1}\in\mathbb{C}^{k\times k}$. Moreover, $\widetilde{A}=PAP^{H}$ can be represented in the LFR format as $\widetilde{A}=G_{2}^{H}(I_{n}+[I_{k},0]^{T}Z^{H})G_{1}^{H}$, where $L=G_{2}^{H}$, $R=G_{1}^{H}$, $PD^{H}P^{H}=G=G_{1}G_{2}$ is the factorization of $G$ provided in Lemma 2.4 and $F=I_{n}+[I_{k},0]^{T}Z^{H}$, $Z^{H}=U_{1}(PV)^{H}G_{1}$.*

The overall cost of computing this condensed LFR representation of the unitary plus rank- $k$ matrix $A$ is $\mathcal{O}(n^{2}k)$ flops using $\mathcal{O}(nk)$ memory storage.

In the next sections we investigate the properties of the Hessenberg reduction of a matrix given in the $LFR$ format.

3 Factored Representations of Hessenberg Matrices

In this section we investigate suitable conditions under which a factored representation $A=LFR\in\mathbb{C}^{m\times m}$, where $L$ is the product of $k<n$ unitary lower Hessenberg matrices, $R$ is the product of $k$ unitary upper Hessenberg matrices and the middle factor $F$ is unitary plus rank-$k$, specifies a matrix in Hessenberg form. In Section 4 we will discuss the chasing algorithm for reducing, by unitary similarity, a matrix of the form $L(I+(e_{1}\otimes I_{k})Z^{H})R$ to Hessenberg form while maintaining the factorization and enforcing the properness of the factor $L$ to avoid breakdown of the subsequent $QR$ iterations.

A key ingredient is the properness of the generalized Hessenberg factors.

Definition 3.1.

*A matrix $H\in\mathbb{C}^{m\times m}$ is called $k$-upper Hessenberg if $h_{ij}=0$ when $i>j+k$. Similarly, $H$ is called $k$-lower Hessenberg if $h_{ij}=0$ when $j>i+k$. In addition, when $H$ is $k$-upper Hessenberg ($k$-lower Hessenberg) and the outermost entries are non-zero, that is, $h_{j+k,j}\neq 0$ ($h_{j,j+k}\neq 0$), $1\leq j\leq m-k$, then the matrix is called proper.*

Note that for $k=1$ a Hessenberg matrix $H$ is proper iff it is unreduced. Also, a $k$ -upper Hessenberg matrix $H\in\mathbb{C}^{m\times m}$ is proper iff $\det(H(k+1:m,1:m-k))\neq 0$ . Similarly a $k$ -lower Hessenberg matrix $H$ is proper iff $\det(H(1:m-k,k+1:m))\neq 0$ .
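Definition 3.1 translates directly into entrywise checks. The following minimal sketch uses hypothetical helper names and a user-supplied tolerance `tol` (an assumption, since the definition is stated in exact arithmetic); indices are 0-based in the code and 1-based in the text.

```python
def is_k_upper_hessenberg(H, k, tol=0.0):
    """True if H[i][j] vanishes (up to tol) whenever i > j + k."""
    m = len(H)
    return all(abs(H[i][j]) <= tol for i in range(m) for j in range(m) if i > j + k)

def is_proper_k_upper_hessenberg(H, k, tol=0.0):
    """True if H is k-upper Hessenberg and the outermost entries h_{j+k,j} are nonzero."""
    m = len(H)
    if not is_k_upper_hessenberg(H, k, tol):
        return False
    return all(abs(H[j + k][j]) > tol for j in range(m - k))

def transpose(H):
    """Plain transpose; it swaps the roles of the lower and upper bands."""
    m = len(H)
    return [[H[j][i] for j in range(m)] for i in range(m)]

def is_proper_k_lower_hessenberg(H, k, tol=0.0):
    """A matrix is proper k-lower Hessenberg iff its transpose is proper k-upper Hessenberg."""
    return is_proper_k_upper_hessenberg(transpose(H), k, tol)

# A 4x4 example with k = 2: only the entry in position (4,1) must vanish.
H = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 1, 2, 3],
     [0, 4, 5, 6]]
```

For this example the matrix is proper $2$-upper Hessenberg, since $h_{3,1}=9$ and $h_{4,2}=4$ are nonzero, but it is not $1$-upper Hessenberg.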

An important property of any unitary upper Hessenberg matrix $H\in\mathbb{C}^{m\times m}$ is that it can be represented as a product of elementary transformations, i.e., $H=\mathcal{G}_{1}\mathcal{G}_{2}\cdots\mathcal{G}_{m-1}\mathcal{D}_{m}$, where $\mathcal{G}_{\ell}=I_{\ell-1}\oplus G_{\ell}\oplus I_{m-\ell-1}$ with $G_{\ell}=\left[\begin{array}{cc}\alpha_{\ell}&\beta_{\ell}\\ -\beta_{\ell}&\bar{\alpha}_{\ell}\end{array}\right]$, $|\alpha_{\ell}|^{2}+\beta_{\ell}^{2}=1$, $\alpha_{\ell}\in\mathbb{C}$, $\beta_{\ell}\in\mathbb{R}$, $\beta_{\ell}\geq 0$, are unitary Givens rotations and $\mathcal{D}_{m}=I_{m-1}\oplus\theta_{m}$ with $|\theta_{m}|=1$. In this way the matrix $H$ is stored by two vectors of length $m$ formed by the elements $\alpha_{\ell},\beta_{\ell}$, $1\leq\ell\leq m-1$, and $\theta_{m}$. The same representation also extends to unitary $k$-upper Hessenberg matrices specified as the product of $k$ unitary upper Hessenberg matrices multiplied on the right by a unitary diagonal matrix which is the identity matrix modified in the last diagonal entry. Lower unitary Hessenberg matrices can be parametrized similarly as $H=\mathcal{G}_{m-1}\mathcal{G}_{m-2}\cdots\mathcal{G}_{1}\mathcal{D}_{m}$.
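The parametrization above can be sketched as follows. This is an illustrative dense assembly only, with hypothetical function names; a fast algorithm would keep the parameters $\alpha_{\ell},\beta_{\ell},\theta_{m}$ and never form $H$ explicitly.

```python
import math

def matmul(A, B):
    """Plain dense matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

def embed_rotation(m, ell, alpha, beta):
    """m x m identity with [[alpha, beta], [-beta, conj(alpha)]] placed in
    rows and columns ell, ell+1 (1-based, as in the text)."""
    G = [[(1 + 0j) if i == j else 0j for j in range(m)] for i in range(m)]
    i = ell - 1
    G[i][i], G[i][i + 1] = alpha, beta
    G[i + 1][i], G[i + 1][i + 1] = -beta, alpha.conjugate()
    return G

def unitary_hessenberg(alphas, theta):
    """Assemble H = G_1 G_2 ... G_{m-1} D_m from m-1 complex parameters alpha_l
    with |alpha_l| <= 1 and a unimodular theta; beta_l = sqrt(1 - |alpha_l|^2)."""
    m = len(alphas) + 1
    betas = [math.sqrt(max(0.0, 1.0 - abs(a) ** 2)) for a in alphas]
    H = [[(1 + 0j) if i == j else 0j for j in range(m)] for i in range(m)]
    for ell, (a, b) in enumerate(zip(alphas, betas), start=1):
        H = matmul(H, embed_rotation(m, ell, complex(a), b))
    for row in H:            # D_m = I_{m-1} (+) theta scales the last column only
        row[m - 1] *= theta
    return H, betas

alphas = [0.3 + 0.4j, -0.5j, 0.8 + 0j]
theta = complex(math.cos(0.7), math.sin(0.7))
H, betas = unitary_hessenberg(alphas, theta)
```

In this construction the subdiagonal entries of $H$ have modulus $\beta_{\ell}$, so $H$ is proper (unreduced) exactly when all $\beta_{\ell}$ are nonzero.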

Another basic property of unitary plus rank-$k$ matrices is the existence of suitable embeddings which maintain their structural properties. The embedding turns out to be crucial to ensure the properness of the factor $L$ and guarantee the safe application of implicit $QR$ iterations. The embedding is also important for the bulge chasing algorithm, as we explain in the next section. The following result was first proved in [8] and is here specialized to a matrix of the form determined in Theorems 2.6, 2.7 and 2.9.

Theorem 3.2.

Let $A\in\mathbb{C}^{n\times n}$ be such that $A=L(I+(e_{1}\otimes I_{k})Z^{H})R=LFR$ , where $L$ and $R$ are unitary and $Z\in\mathbb{C}^{n\times k}$ . Let $Z=QG$ , $G\in\mathbb{C}^{k\times k}$ , be the economic QR factorization of $Z$ . Let $\widehat{U}\in\mathbb{C}^{m\times m}$ , $m=n+k$ , be defined as

[TABLE]

Then it holds

1. $\widehat{U}$ is unitary;

2. the matrix $\widehat{A}\in\mathbb{C}^{m\times m}$ given by

[TABLE]

satisfies

[TABLE]

Proof 3.3.

Property 1 follows by direct calculations from

[TABLE]

For Property 2 we find that

[TABLE]

The unitary matrices $L$ and $R$ given in Theorems 2.6, 2.7 and 2.9 are $k$-Hessenberg matrices. The same clearly holds for the larger matrices $\operatorname{diag}(L,I_{k})$ and $\operatorname{diag}(R,I_{k})$ occurring in the factorization of $\widehat{A}$. The next result is the main contribution of this section and it provides conditions under which a matrix specified as a product $L\cdot\tilde{F}\cdot R$, where $L$ is a unitary $k$-lower Hessenberg matrix, $R$ is a unitary $k$-upper Hessenberg matrix and $\tilde{F}$ is a unitary matrix plus a rank-$k$ correction, is in Hessenberg form.

In fact, once we apply the embedding described by Theorem 3.2 to $A=L(I+(e_{1}\otimes I_{k})Z^{H})R$, the matrix obtained, $\widehat{A}$, is no longer in the LFR format since the middle factor is not in the format required by Definition 2.1. Moreover, $\widehat{L}=L\oplus I_{k}$ is not a proper matrix, making implicit $QR$ iterations subject to breakdown.

Theorem 3.4.

Let $L,R\in\mathbb{C}^{m\times m}$, $m=n+k$, be two unitary matrices, where $L$ is a proper unitary $k$-lower Hessenberg matrix and $R$ is a unitary $k$-upper Hessenberg matrix. Let $Q$ be a block diagonal unitary upper Hessenberg matrix of the form $Q=\left[\begin{array}{c|c}I_{k}&\\ \hline&\hat{Q}\end{array}\right]$, with $\hat{Q}$ $n\times n$ unitary Hessenberg. Let $F=Q+[I_{k},0]^{T}Z^{H}$ be a unitary plus rank-$k$ matrix with $Z\in\mathbb{C}^{m\times k}$. Suppose that the matrix $\widehat{A}=LFR$ satisfies the block structure

[TABLE]

*Then $\widehat{A}$ is an upper Hessenberg matrix.*

Proof 3.5.

From Lemma 2.3 we find that $M=L(n+1:m,1:k)$ is nonsingular due to the properness of $L$ .

Now, let us consider the matrix $C=L\,Q$. This matrix is unitary with a $k$-quasiseparable structure below the $k$-th upper diagonal. Indeed, for any $h=2,\ldots,n+1$ we have

[TABLE]

Applying Lemma 2.3 we have $\operatorname{rank}(L(h:m,1:h+k-1))=k$, implying that also $\operatorname{rank}(C(h:m,1:h+k-2))\leq k$. Since $C(n+1:m,1:k)=L(n+1:m,:)Q(:,1:k)=M$ is nonsingular, we conclude that $\operatorname{rank}(C(h:m,1:h+k-2))=k$, $2\leq h\leq n+1$.

From this observation we can then find a set of generators $P,S\in\mathbb{C}^{m\times k}$ and a $(1-k)$-upper Hessenberg matrix $U_{k}$ such that $U_{k}(1,k)=U_{k}(n,m)=0$ so that $C=PS^{H}+U_{k}$ [13].

Then we can recover the rank $k$ correction $PS^{H}$ from the left-lower corner of $C$ obtaining

[TABLE]

since $C(:,1:k)=LQ(:,1:k)=L(:,1:k)$. Notice that $B=U_{k}\,R$ is upper Hessenberg as it is the product of a $(1-k)$-upper Hessenberg matrix by a $k$-upper Hessenberg matrix. Moreover, we find that $B(n+1:m,:)=U_{k}(n+1:m,:)R=0$ since $U_{k}(n+1:m,:)=0$. From the block structure of $\widehat{A}$ it follows that

[TABLE]

which gives

[TABLE]

Hence $U_{k}=L(Q+[I_{k},0]^{T}Z^{H})=L\,F$ and therefore $B=U_{k}\,R=LFR=\widehat{A}$, which concludes the proof.

4 The Bulge Chasing Algorithm

In this section we present a bulge-chasing algorithm relying upon Theorem 3.4 to compute the Hessenberg reduction of the matrix $\widehat{A}$ given as in Theorem 3.2, i.e., the embedding of $A=L(I+(e_{1}\otimes I_{k})Z^{H})R$ . We recall that $Q$ and $G$ are the factors of the economic $QR$ factorization of $Z$ .

Let us set

[TABLE]

so that we have

[TABLE]

Observe that $X(k+1:m,:)=Y(k+1:m,:)$ and, moreover, $Y(n+1:m,:)=-I_{k}$ which implies $\operatorname{rank}(Y)=k$ . In the preprocessing phase we initialize

[TABLE]

Notice that $L_{0}$ is a unitary $k$ -lower Hessenberg matrix and $R_{0}$ is a unitary $k$ -upper Hessenberg matrix and, therefore, they can both be represented by the product of $k$ Hessenberg matrices. This property will be maintained under the bulge chasing process. In the cases considered in this paper, we rely on the additional structure of $L_{0}$ namely that $L_{0}$ is also $k$ -upper Hessenberg as we can observe from Theorems 2.6, 2.7 and 2.9.

In this section we make use of the following technical result.

Lemma 4.1.

Let $B\in\mathbb{C}^{n\times n}$ be a unitary $k$-Hessenberg matrix. Let $H\in\mathbb{C}^{n\times n}$ be a unitary Hessenberg matrix obtained as a sequence of ascending or descending Givens transformations acting on two consecutive rows, i.e., $H={\mathcal{G}}_{n-1}{\mathcal{G}}_{n-2}\cdots{\mathcal{G}}_{1}$ if $H$ is lower Hessenberg or $H={\mathcal{G}}_{1}{\mathcal{G}}_{2}\cdots{\mathcal{G}}_{n-1}$ if $H$ is upper Hessenberg. Then there exist a unitary $k$-Hessenberg matrix $\tilde{B}$ (with the same orientation as $B$) and a unitary Hessenberg matrix $\tilde{H}$ such that $HB=\tilde{B}\tilde{H}$, where

• $\tilde{H}=\begin{bmatrix}I_{k}&\\ &\hat{H}\end{bmatrix}$ if $B$ is $k$-lower Hessenberg,

• $\tilde{H}=\begin{bmatrix}\hat{H}&\\ &I_{k}\end{bmatrix}$ if $B$ is $k$-upper Hessenberg,

and $\hat{H}$ has the same orientation as $H$.

Proof 4.2.

We prove the lemma only in the case where $H$ is lower Hessenberg and $B$ is $k$-upper Hessenberg. We need to move each of the $n-1$ Givens rotations of $H$ to the right of $B$. The first $k$ Givens rotations of $H$, namely ${\mathcal{G}}_{1},\ldots,{\mathcal{G}}_{k}$, when applied to $B$ do not destroy the $k$-upper Hessenberg structure of $B$, so that ${\mathcal{G}}_{k}{\mathcal{G}}_{k-1}\cdots{\mathcal{G}}_{1}B=\hat{B}$ is still $k$-upper Hessenberg. When we apply ${\mathcal{G}}_{k+1}$ to $\hat{B}$ a bulge is produced in position $(k+2,1)$, and we need to apply a rotation on the first two columns of ${\mathcal{G}}_{k+1}\hat{B}$ to remove the bulge, i.e., ${\mathcal{G}}_{k+1}\hat{B}=\hat{B}_{1}\tilde{\mathcal{G}}_{1}$. In the same way we can treat all of the $n-k-1$ rotations ${\mathcal{G}}_{k+1},\ldots,{\mathcal{G}}_{n-1}$: at step $i$ we have ${\mathcal{G}}_{k+i}\hat{B}_{i-1}=\hat{B}_{i}\tilde{\mathcal{G}}_{i}$, with $\hat{B}_{0}=\hat{B}$. The last Givens rotation ${\mathcal{G}}_{n-1}$ produces a bulge in position $(n,n-k-1)$ which can be removed by the rotation $\tilde{\mathcal{G}}_{n-k-1}$ acting on the columns $(n-k-1,n-k)$. We do not need to rotate the columns with indices between $n-k$ and $n$, so that

[TABLE]

We can similarly prove the remaining three cases.
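The mechanism used in the proof, a row rotation creating a bulge that is then removed by a column rotation, can be sketched on a small real example. The sketch below is an illustration only: it uses a generic (not unitary) $k$-upper Hessenberg matrix, real rotations, and hypothetical helper names, since only the zero pattern matters for the creation and removal of the bulge.

```python
import math

def is_k_upper_hessenberg(M, k, tol=1e-12):
    """True if M[i][j] vanishes (up to tol) whenever i > j + k (0-based indices)."""
    n = len(M)
    return all(abs(M[i][j]) <= tol for i in range(n) for j in range(n) if i > j + k)

def rotate_rows(M, i, c, s):
    """Left-multiply rows i, i+1 (0-based) of M by [[c, s], [-s, c]], in place."""
    for j in range(len(M[0])):
        a, b = M[i][j], M[i + 1][j]
        M[i][j], M[i + 1][j] = c * a + s * b, -s * a + c * b

def rotate_cols(M, j, c, s):
    """Right-multiply columns j, j+1 (0-based) of M by [[c, -s], [s, c]], in place."""
    for row in M:
        a, b = row[j], row[j + 1]
        row[j], row[j + 1] = c * a + s * b, -s * a + c * b

def chase_first_bulge(B, k, angle=0.3):
    """Rotate rows k+1, k+2 (1-based) of a k-upper Hessenberg B, which creates a
    bulge in position (k+2, 1), then remove it by rotating columns 1 and 2.
    Returns (matrix_with_bulge, matrix_after_removal)."""
    M = [row[:] for row in B]
    rotate_rows(M, k, math.cos(angle), math.sin(angle))
    with_bulge = [row[:] for row in M]
    a, b = M[k + 1][0], M[k + 1][1]      # bulge entry and its right neighbour
    r = math.hypot(a, b)
    rotate_cols(M, 0, b / r, -a / r)     # chosen so that the new (k+2, 1) entry is zero
    return with_bulge, M

n, k = 6, 2
B = [[float(1 + (3 * i + 5 * j) % 7) if i <= j + k else 0.0 for j in range(n)]
     for i in range(n)]
with_bulge, restored = chase_first_bulge(B, k)
```

After the column rotation the matrix is again $k$-upper Hessenberg, and the rotation that was on the left of the matrix has been traded for one on its right, acting on the first two columns.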

The reduction of $\widehat{A}=\widehat{A}_{0}$ to Hessenberg form proceeds in three steps according to Theorem 3.4. The first two steps amount to determining a different representation of the same matrix $\widehat{A}_{0}$. In particular, after these two steps the rank correction inside the brackets is confined to the first $k$ rows, while the $L_{0}$ factor on the left of the representation is replaced by a factor which is proper and still has the lower $k$-Hessenberg structure. The third step is a bulge-chasing scheme to complete the Hessenberg reduction.

1. (QR decomposition of $Y_{0}$) We compute the full QR factorization $Y_{0}=Q_{0}T_{0}$. Since $Y_{0}$ has full rank, the matrix $\hat{T}_{0}=T_{0}(1:k,:)$ is invertible and, moreover, the matrix $Q_{0}$ can be taken as a proper $k$-lower Hessenberg matrix (see Lemma 2.4 of [8]). We can write

[TABLE]

Then the matrix $\widehat{A}_{1}\colon=L_{0}^{H}\widehat{A}_{0}L_{0}$ is such that

[TABLE]

Notice that $\widehat{U}_{1}:=Q_{0}^{H}\widehat{U}R_{0}$ is a unitary $2k$ -upper Hessenberg matrix. Indeed, we have that

[TABLE]

where $\hat{X}:=Q_{0}^{H}X$ and $\hat{X}(2k+1:m,:)=-Q_{0}^{H}(2k+1:m,1:k)G^{H}=0$ since $Q_{0}^{H}(2k+1:m,1:k)=0$. Therefore, it holds that $\widehat{U}_{1}=((I_{2k}-\hat{X}(1:2k,:)\hat{X}^{H}(:,1:2k))\oplus I_{m-2k})Q_{0}^{H}R_{0}$ which, owing to the block diagonal structure of $I_{m}-\hat{X}\hat{X}^{H}$, turns out to be $2k$-upper Hessenberg.

2. (Block decomposition of $\widehat{U}_{1}$) We compute the full QR factorization of $\widehat{U}_{1}^{H}(:,1:k)$. Specifically, we determine a unitary matrix $P$ such that $\widehat{U}_{1}(1:k,:)P=\left[I_{k},0\right]$; such a $P$ can be taken in $k$-lower Hessenberg form (see Lemma 2.4 of [8]). The matrix

[TABLE]

where $\hat{Q}$ is a unitary $k$ -upper Hessenberg matrix, due to the fact that $U_{1}(k+1:m,:)$ is $k$ -upper Hessenberg and $P(:,k+1:m)$ is lower triangular. We obtain that

[TABLE]

which gives

[TABLE]

Applying Lemma 4.1 $k$ times, and observing that $L_{0}$ is $k$-banded (i.e., simultaneously $k$-upper and $k$-lower Hessenberg), we can factorize $L_{0}Q_{0}=Q_{1}L_{1}$, where $Q_{1}$ is a unitary $k$-lower Hessenberg matrix and $L_{1}=\left[\begin{array}{c|c}I_{k}&\\ \hline&\hat{L}_{1}\end{array}\right]$, where $\hat{L}_{1}$ is a unitary $k$-upper Hessenberg matrix. It follows that

[TABLE]

where the matrix $\widehat{U}_{2}:=L_{1}\widehat{U}_{1}P$ satisfies $\widehat{U}_{2}=\left[\begin{array}{c|c}I_{k}&\\ \hline&\tilde{U}_{2}\end{array}\right]$, with $\tilde{U}_{2}$ a unitary $2k$-upper Hessenberg matrix, and $W_{1}:=P^{H}R_{0}^{H}W_{0}\hat{T}_{0}^{H}$, where $\hat{T}_{0}=T_{0}(1:k,1:k)$. Observe that $Q_{0}(n+1:m,1:k)=Q_{1}(n+1:m,1:k)$ and, moreover, $Q_{0}(n+1:m,1:k)$ is nonsingular because $Q_{0}$ is proper. From Lemma 2.3 this implies the properness of $Q_{1}$. This property is maintained in the subsequent steps of the reduction process so that the final matrix is guaranteed to be proper, as prescribed in Theorem 3.4.

At the end of this step the enlarged matrix $\widehat{A}$ has been reduced to a product of a proper $k$-lower Hessenberg matrix $Q_{1}$ on the left, a unitary factor corrected in the first $k$ rows, i.e., the term inside the brackets, and a $k$-upper Hessenberg matrix, i.e., $P^{H}$. Step 3 consists of the reduction of $\widehat{U}_{2}$ to Hessenberg form so that the final matrix will be unitarily similar to $\widehat{A}$ and in the $LFR$ format.

3. (Hessenberg reduction of $\widehat{U}_{2}$) We now need to work on the representation of $\widehat{A}_{0}$ in equation (4) to reduce the inner matrix $\widehat{U}_{2}$ to Hessenberg form by means of a bulge-chasing procedure. Indeed, Theorem 3.4 ensures that the matrix obtained will be in the LFR format and in Hessenberg form. These transformations will not affect the properness of the $k$-lower Hessenberg term on the left.

For the sake of illustration let us consider the first step. Let us determine a unitary upper Hessenberg matrix $\mathcal{G}_{1}\in\mathbb{C}^{2k\times 2k}$ such that

[TABLE]

Then setting $G_{1}=(I_{k+1}\oplus{\mathcal{G}}_{1}\oplus I_{n-2k-1})$ , we have

[TABLE]

The application of $G_{1}^{H}$ on the right of the matrix $Q_{1}$, by computing $Q_{1}(:,k+2:3k+1)\mathcal{G}_{1}^{H}$, creates a bulge formed by an additional segment above the last nonzero superdiagonal of $Q_{1}$. This segment can be annihilated by a new unitary upper Hessenberg matrix $G_{2}$ whose active part $\mathcal{G}_{2}\in\mathbb{C}^{2k\times 2k}$ works on the left of $Q_{1}(:,k+2:3k+1)\mathcal{G}_{1}^{H}$ by acting on the rows of indices 2 through $2k+1$. We can then apply a similarity transformation to remove the bulge

[TABLE]

where $Q_{2}:=G_{2}Q_{1}G_{1}^{H}$ . The active part of $G_{2}^{H}$ , the $2k\times 2k$ matrix $\mathcal{G}_{2}^{H}$ , acts on the right of $P^{H}$ producing a bulge which can be zeroed by a unitary upper Hessenberg matrix $\mathcal{G}_{3}\in\mathbb{C}^{2k\times 2k}$ working on rows from $k+2$ to $3k+1$ of $P^{H}G_{2}^{H}$ . Then, the matrix

[TABLE]

has a bulge on the rows of indices $2k+2$ through $4k+1$ which can be chased away by a sequence of $O(n/k)$ transformations having the same structure as above. Note that the rank correction of the unitary matrix inside the brackets is never affected by these transformations so that, at the end of the process, we have unitarily reduced $\widehat{A}_{0}$ to the LFR format in Definition 2.1. The zeros in the last $k$ rows are also preserved.

The cost analysis is rather standard for matrix algorithms based on chasing operations [4].

1. Step 1 requires computing the economic QR decomposition of a matrix of size $(n+k)\times k$ and multiplying a unitary $k$-Hessenberg matrix, specified as a product of $k$ unitary Hessenberg matrices, by $k$ vectors of size $n+k$. The total cost is $O(nk^{2})$ ops.

2. The cost of Step 2 is asymptotically the same. The construction of the factored representation of $\hat{Q}$ as well as the computation of $L_{1}$ and $Q_{1}$ can still be performed using $O(nk^{2})$ ops.

3. The dominant cost is the execution of Step 3. The zeroing of the sub-subdiagonal entries costs $O(n\frac{n}{k}k^{2})=O(n^{2}k)$ ops.
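One plausible reading of the three factors in the Step 3 estimate, stated here as an assumption rather than as the authors' own accounting, is the following: $O(n)$ columns of the $2k$-upper Hessenberg matrix have to be reduced, each reduction triggers a chase of $O(n/k)$ transformations, and each transformation (a $2k\times 2k$ unitary Hessenberg matrix, i.e. $O(k)$ rotations, passed through chains of $k$ Hessenberg factors) costs $O(k^{2})$ ops.

```latex
\underbrace{O(n)}_{\text{columns to reduce}}
\;\times\;
\underbrace{O\!\left(\tfrac{n}{k}\right)}_{\text{transformations per chase}}
\;\times\;
\underbrace{O(k^{2})}_{\text{ops per transformation}}
\;=\; O\!\left(n\,\frac{n}{k}\,k^{2}\right) \;=\; O(n^{2}k).
```

Under this reading, Steps 1 and 2 contribute only $O(nk^{2})$ ops and are dominated by Step 3 whenever $k<n$.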

In the next section we provide algorithmic details and discuss the results of numerical experiments confirming the effectiveness and the robustness of our proposed approach.

5 Numerical Results

The structured Hessenberg reduction scheme described in the previous section has been implemented in MATLAB for numerical testing. The resulting algorithm essentially amounts to manipulating chains of unitary Hessenberg matrices.

At step 1 of the structured Hessenberg reduction scheme we first compute the full QR factorization of the matrix $Y_{0}\in\mathbb{C}^{m\times k}$ . The matrix $Q_{0}^{H}$ turns out to be the product of $k$ unitary upper Hessenberg matrices. Then we have to incorporate the unitary matrix $\mathcal{S}:=I_{2k}-\hat{X}(1:2k,:)\hat{X}^{H}(:,1:2k)$ on the right into the factored representations of $Q_{0}^{H}$ and $R_{0}$ . The unitary $2k\times 2k$ matrix $\mathcal{S}$ can always be represented as the product of at most $k(2k-1)$ elementary unitary transformations of size $2\times 2$ . Once this factorization is computed, we have to add each of these single transformations, one by one, on the right to the factored representations of $Q_{0}^{H}$ and $R_{0}$ . This is accomplished by a sequence of turnover and fusion operations acting on the chains of elementary transformations in $Q_{0}^{H}$ and $R_{0}$ (see [23] for the detailed description of these operations on elementary transformations).
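The count $k(2k-1)$ is the number of subdiagonal entries of a $2k\times 2k$ matrix, each of which can be zeroed by one $2\times 2$ rotation acting on two adjacent rows. The following pure-Python sketch illustrates this on a small example. It is not the authors' MATLAB code: the helper names are hypothetical, and the test matrix is a Householder reflector chosen only because it is an easily built unitary matrix. After the reduction the remaining factor is a unitary diagonal matrix.

```python
import math


def givens(a: complex, b: complex):
    """Return (c, s) so that [[c, s], [-conj(s), conj(c)]] maps (a, b) to (r, 0)."""
    r = math.hypot(abs(a), abs(b))
    if r == 0.0:
        return 1.0 + 0j, 0j
    return a.conjugate() / r, b.conjugate() / r


def factor_unitary(S):
    """Reduce S to diagonal form by rotations acting on adjacent rows.

    Returns (rotations, D). Each rotation is (i, c, s) and acts on rows i, i+1.
    D is the product of all rotations with S; it is numerically diagonal
    when S is unitary.
    """
    n = len(S)
    A = [[complex(x) for x in row] for row in S]
    rotations = []
    for j in range(n - 1):                  # column being cleared
        for i in range(n - 2, j - 1, -1):   # zero entry (i+1, j), bottom to top
            c, s = givens(A[i][j], A[i + 1][j])
            for col in range(n):
                x, y = A[i][col], A[i + 1][col]
                A[i][col] = c * x + s * y
                A[i + 1][col] = -s.conjugate() * x + c.conjugate() * y
            rotations.append((i, c, s))
    return rotations, A


def householder_unitary(v):
    """Hypothetical test matrix: the unitary reflector I - 2 v v^H / (v^H v)."""
    nrm2 = sum(abs(x) ** 2 for x in v)
    n = len(v)
    return [[(1.0 if i == j else 0.0) - 2 * v[i] * v[j].conjugate() / nrm2
             for j in range(n)] for i in range(n)]


k = 2
S = householder_unitary([1 + 2j, -0.5j, 3.0 + 0j, 0.25 - 1j])  # size 2k x 2k
rotations, D = factor_unitary(S)
off_diag = max(abs(D[i][j]) for i in range(2 * k) for j in range(2 * k) if i != j)
print(len(rotations), k * (2 * k - 1), off_diag)
```

Each stored triple `(i, c, s)` is one elementary transformation of the kind that would then be merged, one at a time, into the chains representing $Q_{0}^{H}$ and $R_{0}$.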

At the beginning of step 2 the matrix $\widehat{U}_{1}$ is a $2k$-upper Hessenberg matrix, essentially determined by the product of two unitary $k$-upper Hessenberg matrices, which we rename here as $\widehat{U}_{1}=\widehat{P}\widehat{Q}$. To reshape this factorization into the form required by equation (3), we apply $k$ times an argument similar to the one used in Lemma 4.1 to move each elementary transformation of $\widehat{Q}$ to the left. In this way we find $\widehat{P}\widehat{Q}=\widetilde{Q}\widetilde{P}$, where $\widetilde{Q}=\left[\begin{array}{c|c}I_{k}&\\ \hline &\hat{Q}\end{array}\right]$ is the matrix appearing in (3). Since $\widehat{Q}$ is formed by $O(nk)$ elementary transformations, the reshaping costs $O(nk^{2})$ ops. By a similar argument we can compute the representations of $Q_{1}$ and $L_{1}$, where $Q_{1}$ is $k$-lower Hessenberg and $L_{1}=\begin{bmatrix}I_{k}&\\ &\hat{L}_{1}\end{bmatrix}$, with $\hat{L}_{1}$ unitary $k$-upper Hessenberg.
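The relation between chains of rotations and $k$-Hessenberg structure can be checked on a small example. The sketch below, in pure Python with hypothetical helper names and arbitrarily chosen angles, builds two unitary upper Hessenberg matrices as descending chains of $2\times 2$ rotations and verifies that their product is unitary with two nonzero subdiagonals. Dense matrices are formed here only for checking; the trailing unitary diagonal factor of a general Schur parametrization is omitted for brevity.

```python
import cmath
import math


def rotation(n, i, theta, phi=0.0):
    """n x n identity with a 2 x 2 unitary rotation embedded in rows/columns i, i+1."""
    G = [[1.0 + 0j if r == c else 0j for c in range(n)] for r in range(n)]
    c = math.cos(theta)
    s = math.sin(theta) * cmath.exp(1j * phi)
    G[i][i], G[i][i + 1] = c, s
    G[i + 1][i], G[i + 1][i + 1] = -s.conjugate(), c
    return G


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][p] * B[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def unitary_hessenberg(n, angles):
    """Product G_0 G_1 ... G_{n-2} of a descending chain of rotations."""
    H = rotation(n, 0, *angles[0])
    for i in range(1, n - 1):
        H = matmul(H, rotation(n, i, *angles[i]))
    return H


def lower_bandwidth(A, tol=1e-12):
    """Largest i - j over the entries with |A[i][j]| > tol."""
    n = len(A)
    return max((i - j for i in range(n) for j in range(n) if abs(A[i][j]) > tol),
               default=0)


def unitarity_defect(A):
    """Largest entry of |A^H A - I|."""
    n = len(A)
    return max(abs(sum(A[p][i].conjugate() * A[p][j] for p in range(n)) - (i == j))
               for i in range(n) for j in range(n))


n = 8
H1 = unitary_hessenberg(n, [(0.3 + 0.1 * i, 0.2 * i) for i in range(n - 1)])
H2 = unitary_hessenberg(n, [(0.7 + 0.05 * i, -0.1 * i) for i in range(n - 1)])
P = matmul(H1, H2)
print(lower_bandwidth(H1), lower_bandwidth(P), unitarity_defect(P))
```

Storing only the rotation parameters, rather than the dense product, is what keeps a unitary $k$-Hessenberg matrix of size $n$ within $O(nk)$ parameters.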

The third phase of the structured Hessenberg reduction scheme amounts to reducing the matrix $\widehat{U}_{2}=L_{1}\widetilde{Q}$ to a matrix of the form $\left[\begin{array}{c|c}I_{k}&\\ \hline &\tilde{U}_{2}\end{array}\right]$, with $\tilde{U}_{2}$ an $n\times n$ unitary Hessenberg matrix. To be specific, assume that $L_{1}=L_{1,1}\cdots L_{1,k}$ and $\widetilde{Q}=\widetilde{Q}_{1}\cdots\widetilde{Q}_{k}$, where $L_{1,j}$ and $\widetilde{Q}_{j}$ are unitary upper Hessenberg matrices whose leading principal submatrix of order $k$ is the identity matrix. The overall reduction process splits into $n$ intermediate steps. At each step the first active elementary transformations of $\widetilde{Q}_{k},\ldots,\widetilde{Q}_{1},L_{1,k},\ldots,L_{1,1}$ are annihilated (in this order). Each transformation is moved to the left, creating a bulge in the leftmost factor $Q_{1}$. This bulge is removed by applying a similarity transformation.

Let us consider the first step. Let $L_{1,i}={\mathcal{G}}_{k+1}^{(i)}\cdots{\mathcal{G}}_{m-1}^{(i)}D_{m}^{(i)}$ denote the Schur parametrization of $L_{1,i}$ and, similarly, let $\widetilde{Q}_{i}={\mathcal{H}}_{k+1}^{(i)}\cdots{\mathcal{H}}_{m-1}^{(i)}E_{m}^{(i)}$ be that of $\widetilde{Q}_{i}$. At this step we move to the left the first elementary transformation of each factor of the product $L_{1}\widetilde{Q}$. For example, when the rotation ${\mathcal{H}}_{k+1}^{(k)}$ is moved in front of $L_{1}$, the resulting transformation acts on rows $3k$ and $3k+1$, while some of the rotations in $L_{1}$ and $\tilde{Q}$ have changed. The final situation is as follows (as observed, a single unitary diagonal matrix suffices to keep track of all the diagonal contributions):

[TABLE]

where

[TABLE]

At this point we bring the bulge $B$ on the left of $Q_{1}$ in equation (4) obtaining

[TABLE]

where $\widehat{B}=\Gamma_{2k}\cdots\Gamma_{2}$ is the product of a sequence of elementary transformations in ascending order acting on rows $2:2k$. The bulge $\widehat{B}$ is removed by chasing one elementary transformation at a time. For example, to remove $\Gamma_{2k}$ we apply the similarity transformation $\Gamma_{2k}^{H}\widehat{B}\breve{Q}_{1}(+T_{0}W_{0}^{H}R_{0}P)P^{H}\,\Gamma_{2k}$, which shifts the bulge down by $2k$ positions. Hence $O(n/k)$ chasing steps are needed to get rid of that first transformation. In this way the overall process is completed using $O(nk\cdot k\cdot n/k)=O(n^{2}k)$ ops. Note that the whole similarity transformation acts only on the first $n$ rows, leaving untouched the null rows at the bottom of $\widehat{A}$ in equation (2).

Numerical experiments have been performed to confirm the computational properties of the proposed method. Among the three cases considered in Section 2, the last one, in which the unitary part is block diagonal, is the most challenging, since computing the starting LFR format costs $O(n^{2}k)$ flops, versus the $O(nk^{2})$ flops sufficient for the first two cases. The CMV reduction of the input unitary diagonal plus rank-$k$ matrix $D+UV^{H}$ is computed using the algorithm presented in [17], which is fast and backward stable. Our tests focus on the numerical performance of the Hessenberg reduction scheme provided in the previous section, given the factors $L,R$ and $Z$ satisfying Theorem 2.9. The tables below report the backward errors $\epsilon_{P}$, $\epsilon_{B}$ and $\epsilon_{H}$ generated by our procedure. These errors are defined as follows:

1. $\epsilon_{P}$ is the error computed at the end of the first two preparatory steps. Given the matrix $A$ of size $n$ represented as in Theorem 3.2, we find the matrix $\widehat{A}$ of size $m=n+k$ obtained at the end of step 2. Denoting by $fl(\widehat{A})$ the computed matrix, the error is

[TABLE]

2. $\epsilon_{B}$ is the classical backward error generated in the final step, given by

[TABLE]

where $H$ is the matrix computed by multiplying all the factors obtained at the end of the third step, and $Q$ is the product of the unitary transformations acting by similarity on the left and on the right of the matrix $fl(\widehat{A})$ in the Hessenberg reduction phase.

3. $\epsilon_{H}$ is used to measure the Hessenberg structure of the matrix $H$. It is

[TABLE]

where ${\tt tril}(X,K)$ is the matrix formed by the elements on and below the $K$ -th diagonal of $X$ .
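The role of ${\tt tril}$ in a Hessenberg-structure check can be illustrated as follows. This is a sketch under a stated assumption: the exact formula for $\epsilon_{H}$ is not reproduced in this text, so the function `hessenberg_defect` below (a hypothetical name) uses the relative Frobenius norm of ${\tt tril}(H,-2)$, which may differ from the norm and normalization actually used in the experiments.

```python
import math


def tril(X, K=0):
    """Keep the entries on and below the K-th diagonal of X (K < 0: below the main one)."""
    return [[x if j - i <= K else 0.0 for j, x in enumerate(row)]
            for i, row in enumerate(X)]


def frobenius_norm(X):
    return math.sqrt(sum(abs(x) ** 2 for row in X for x in row))


def hessenberg_defect(H):
    """Assumed measure: relative Frobenius norm of the part below the first subdiagonal."""
    return frobenius_norm(tril(H, -2)) / frobenius_norm(H)


H_ok = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [0.0, 7.0, 8.0]]      # upper Hessenberg: nothing below the first subdiagonal
M_full = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0]]    # entry (3, 1) violates the Hessenberg structure

print(hessenberg_defect(H_ok), hessenberg_defect(M_full))
```

A value near machine precision indicates that the computed $H$ is numerically upper Hessenberg.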

The following tables report these errors for different values of $n$, $k$ and $\|A\|_{2}$.

The results in Tables 1, 2, 3 and 4 show that the proposed algorithm is numerically backward stable.

In order to confirm the cost analysis of the algorithm we have also performed experiments keeping the size of the matrix fixed. For matrices of size $512$ with $k$ varying from 2 to 16, the measured elapsed times $t_{k}$ satisfy

[TABLE]

This illustrates the linear growth of the cost with respect to $k$ , the size of the perturbation.

6 Conclusions and Future Work

In this paper we have presented a novel algorithm for the reduction to Hessenberg form of a unitary diagonal plus rank-$k$ matrix. By exploiting the rank structure of the input matrix, this algorithm achieves computational efficiency with respect to both the size of the matrix and the size of the perturbation, as well as numerical accuracy. Complemented with the structured QR iteration described in [8], the algorithm yields a fast and accurate eigensolver for unitary plus low rank matrices.

Bibliography

[1] A. Amiraslani, R. M. Corless, and P. Lancaster, Linearization of matrix polynomials expressed in polynomial bases, IMA J. Numer. Anal., 29 (2009), pp. 141–157, https://doi.org/10.1093/imanum/drm051.
[2] P. Arbenz and G. H. Golub, On the spectral decomposition of Hermitian matrices modified by low rank perturbations with applications, SIAM J. Matrix Anal. Appl., 9 (1988), pp. 40–58, https://doi.org/10.1137/0609004.
[3] Y. Arlinskiĭ, Conservative discrete time-invariant systems and block operator CMV matrices, Methods Funct. Anal. Topology, 15 (2009), pp. 201–236.
[4] J. Aurentz, T. Mach, L. Robol, R. Vandebril, and D. S. Watkins, Core-chasing algorithms for the eigenvalue problem, Fundamentals of Algorithms, SIAM, 2018.
[5] J. Aurentz, T. Mach, L. Robol, R. Vandebril, and D. S. Watkins, Fast and backward stable computation of eigenvalues and eigenvectors of matrix polynomials, Math. Comp., 88 (2019), pp. 313–347, https://doi.org/10.1090/mcom/3338.
[6] A. P. Austin, P. Kravanja, and L. N. Trefethen, Numerical algorithms based on analytic function values at roots of unity, SIAM J. Numer. Anal., 52 (2014), pp. 1795–1821, https://doi.org/10.1137/130931035.
[7] R. Bevilacqua, G. M. Del Corso, and L. Gemignani, A QR based approach for the nonlinear eigenvalue problem, Rendiconti Sem. Mat. Univ. Pol. Torino, 76 (2018), pp. 77–87.
[8] R. Bevilacqua, G. M. Del Corso, and L. Gemignani, Fast QR iterations for unitary plus low rank matrices, 2018, https://arxiv.org/abs/arXiv:1810.02708.

Efficient Reduction of Compressed Unitary plus Low-rank Matrices to Hessenberg form††thanks: The research of the last two authors was partially supported by GNCS project “Tecniche innovative per
