Iterative Refinement and Oversampling for Low Rank Approximation
Victor Y. Pan, Qi Luan, and Soo Go

TL;DR
This paper extends iterative refinement to low rank matrix approximation, linking it to oversampling techniques, and demonstrates that using sparse sketches enables very fast, low-cost approximations with near-optimal accuracy for certain matrices.
Contribution
It introduces a novel iterative refinement approach for low rank approximation, connecting it to oversampling, and shows how sparse sketch matrices enable efficient, very low rank approximations with improved accuracy.
Findings
Sparse sketch matrices enable sublinear cost low rank approximation.
Oversampling improves accuracy to near-optimal levels.
The method is effective for matrices with fast decaying singular values.
Abstract
Iterative refinement is particularly popular for numerical solution of linear systems of equations. We extend it to Low Rank Approximation of a matrix (LRA) and observe close link of the resulting algorithm to oversampling techniques, commonly used in randomized LRA algorithms. We elaborate upon this link and revisit oversampling and some efficient randomized LRA algorithms. Applied with sparse sketch matrices they run significantly faster and in particular yield Very Low Rank Approximation (VLRA) at sublinear cost, using much fewer scalars and flops than the input matrix has entries. This is achieved at the price of deterioration of output accuracy, but according to our formal and empirical study subsequent oversampling improves accuracy to near-optimal level under the spectral norm for a large sub-class of matrices with fast decaying spectra of singular values.
| Input Matrix | ||||
|---|---|---|---|---|
| Gravity () | 1.000 1.036E-05 | 1.000 7.815E-06 | 1.000 7.888E-06 | 1.000 6.683E-06 |
| SLP () | 1.970 4.083 | 1.000 0.002 | 1.000 1.018E-07 | 1.000 1.362E-10 |
| Fast Decay () | 1.000 3.086E-12 | 1.000 3.590E-16 | 1.000 2.470E-16 | 1.000 2.617E-16 |
| Slow Decay () | 1.000 3.604E-05 | 1.000 1.244E-06 | 1.000 1.543E-07 | 1.000 3.754E-08 |
| Input Matrix | Multiplier Type | 1st Itr. | 2nd Itr. | 3rd Itr. | ||
|---|---|---|---|---|---|---|
| before | after | before | after | |||
| Fast Decay | Abr. Had. | 3.1550E+00 | 2.9872E-11 | 1.0000E+00 | 2.4894E-11 | 1.0000E+00 |
| Gaussian | 3.1202E+00 | 1.5322E-11 | 1.0000E+00 | 2.4075E-11 | 1.0000E+00 | |
| Slow Decay | Abr. Had. | 5.0468E+00 | 3.3300E-02 | 1.0003E+00 | 3.1365E-02 | 1.0001E+00 |
| Gaussian | 5.0755E+00 | 3.1995E-02 | 1.0002E+00 | 3.1017E-02 | 1.0001E+00 | |
| Shaw | Abr. Had. | 2.8820E+01 | 5.3133E-01 | 1.0983E+00 | 5.4956E-01 | 1.1225E+00 |
| Gaussian | 1.8235E+01 | 6.2568E-01 | 1.1517E+00 | 5.5411E-01 | 1.1189E+00 | |
| Gravity | Abr. Had. | 1.5762E+01 | 7.6046E-03 | 1.0000E+00 | 8.4108E-03 | 1.0000E+00 |
| Gaussian | 1.2917E+01 | 9.7073E-03 | 1.0000E+00 | 8.7926E-03 | 1.0000E+00 | |
| SLP | Abr. Had. | 1.0931E+02 | 3.3201E-02 | 1.0014E+00 | 1.0484E-02 | 1.0000E+00 |
| Gaussian | 5.2205E+00 | 4.1354E-03 | 1.0000E+00 | 4.6096E-03 | 1.0000E+00 | |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMatrix Theory and Algorithms · Advanced Adaptive Filtering Techniques · Blind Source Separation Techniques
Superfast Escalators for Near-Optimal
Low Rank Approximation of a Matrix
Victor Y. Pan
Soo Go*[1],[a], Qi Luan[1],[b], and Victor Y. Pan[1,2],[c]*
[1] Ph.D. Programs in Computer Science and Mathematics
The Graduate Center of the City University of New York
New York, NY 10036 USA
[2] Department of Computer Science
Lehman College of the City University of New York
Bronx, NY 10468 USA
[a] [email protected], [b] [email protected]
http://comet.lehman.cuny.edu/vpan/
Abstract
A superfast (aka sublinear cost) algorithm only accesses a small fraction of all entries of a matrix. We seek such algorithms for Low Rank Approximation (LRA) of a matrix, but for some matrix families any such algorithm at best halves the large error of the trivial approximation by the matrix filled with 0s. Nevertheless, our simple superfast escalating algorithms compute near-optimal LRAs of a large class of real world matrices according to our analysis and numerical tests.
Key Words:
near-optimal algorithms, superfast (sublinear cost) algorithms, low rank approximation, iterative refinement
2020 Math. Subject Classification:
65F55, 65N75, 65Y20, 68Q25, 68W20
1 Introduction
1.1 Superfast algorithms by means of LRA:
benefits and an obstacle
In modern computations with matrices representing Big Data it is highly desirable to apply algorithms that run superfast, aka in sublinear time, that is, use much fewer memory cells and flops than an input matrix has entries.111Flop is a floating point arithmetic operation. Superfast algorithms (aka running in sublinear arithmetic time) are studied extensively in for Toeplitz, Hankel, Vandermonde, Cauchy, and other structured matrices having small displacement rank [22], [24], and we extend this concept to other matrices. Quite frequently such matrices admit their low rank approximations (LRAs), with which we can operate superfast. By virtue of Eckart-Young-Mirsky theorem, optimal rank- approximation of a matrix under both spectral and Frobenius matrix norms can be readily obtained from its SVD, but computation of SVD is costly and not superfast (see Sec. 2 for background). Can we still compute a close LRA superfast if a matrix admits one? Unfortunately, any superfast algorithm at best halves the large error of the trivial approximation by the matrix filled with 0s, even for an input family of matrices having size and rank at most 1 (see Appendix A or, say, [27, 28, 30, 29]). If randomization is applied, such an algorithm still at best halves the error of the trivial approximation with a probability not close to 0.
1.2 Recent and new progress
This is not the end of the story, however, because nontrivial superfast algorithms of [17, 2, 11, 13] compute near-optimal LRAs of matrices of some large and important classes. For surprisingly simple extension of this impressive progress to yet another class of matrices, we propose superfast algorithms that (i) first compute a crude initial LRA and (ii) then escalate it to near-optimal level.
1.3 Superfast computation of a crude LRA
The algorithm of [37] outputs a near-optimal LRA with a high probability (whp) under random sampling performed by means of pre- and/or post-multiplication of an input matrix by Gaussian (normal) random test matrices and , respectively.222We call a random matrix Gaussian if it is filled with independent standard Gaussian (normal) random variables. Moreover, the algorithm is superfast except for this bottleneck multiplication stage.
We give up formal support for near-optimality to run the algorithm superfast; namely, we propose to replace its test matrices and with sparse multipliers. Then the output approximation errors tend to grow (see [12]) but still tend to stay reasonably low according to our study where we use abridged SRHT matrices for the sparse multipliers (see Secs. 3 and 6 and Appendix C).
1.4 Escalation to superfast near-optimal LRAs
Our superfast algorithms escalate crude LRAs to near-optimal LRAs in two ways – for two large classes of matrices, specified implicitly in Secs. 4 – 6, in terms of the spectra of their singular values and/or by the results of our numerical tests.
Further advance of our pioneering algorithms and of their formal and empirical support are natural challenges as well as further extension of our approach to tensor decomposition.
1.5 Related works
In our choice of abridged SRHT matrices for sparse multipliers we follow [27, 28, 29, 30], which proved some crude but reasonable probabilistic upper bounds on output errors of some superfast LRA algorithms under randomization in the input matrix space defined by a fixed probability structure in that space.333These papers continued the study in [32, 31, 34], which introduced a probability structure in the space of input matrices to motivate application of Gaussian elimination without pivoting.
In Sec. 5 we extend to LRA the classical iterative refinement techniques, which are popular for the solution of a linear system of equations (see [36, Secs. 3.3.4 and 4.2.5], [9, Ch. 12 and Sec. 20.5], [3], [7, Secs. 3.5.3 and 5.3.8], [1, Secs. 1.4.6, 2.1.4, and 2.3.7]) as well as for various other linear and nonlinear matrix computations (see [36, page 223 and the references on page 225]). Our extension is not straightforward – we incorporate the re-compression techniques of [19, 20, 25, 33, 26, 23] and [22, Secs. 6.3-6.5].
We know of no other attempts of devising iterative refinement algorithms for LRA except for the paper [14], whose algorithm uses no crude auxiliary LRA but instead bounds the Principal Angle Distances between singular spaces of and its LRAs and incorporates randomization based on the leverage scores of [4].
1.6 Organization
of our paper
In the next section we recall some background material. We specify sample superfast computation of a crude LRA in Sec. 3 and present our superfast LRA escalators in Secs. 4 and 5. We cover our test results in Sec. 6 and devote Sec. 7 to conclusions. In Appendix A we describe some small families of matrices for which any superfast LRA algorithm fails. In Appendix B we recall optimal and nearly optimal algorithms for LRA of a matrix based on computing its top SVD and rank-revealing QRP factorization, respectively. In Appendix C we generate abridged SRHT matrices, which we use for devising superfast LRA algorithms by means of random sampling.
2 Background
denotes the spectral norm of a matrix , to which we restrict our study.
Definition 2.1**.**
An matrix has -rank* at most for a fixed tolerance if and only if admits its approximation within an error norm by a matrix of rank at most or equivalently if and only if there exist three matrices , and such that*
[TABLE]
A matrix admits its LRA if and only if it has -rank where and are small in context.
For a small positive one can regard -rank as numerical rank (cf. [7, pages 275-276]), being aware that -rank is numerically unstable where multiple singular values are close to .
A 2-factor LRA of of (2.1) can be generalized to a 3-factor LRA,
[TABLE]
[TABLE]
denotes the th largest singular value of a matrix , .
For a matrix define its -top SVD where
- •
is the diagonal matrix of the top (largest) singular values of ,
- •
and are two matrices of the associated top singular spaces of , having orthonormal columns,
- •
we call the matrix the rank- truncation of the SVD of and can define that matrix by setting to 0 all singular values of except for the largest ones.
is optimal rank- approximation of under both spectral and Frobenius norms by virtue of the Eckart-Young-Mirsky theorem (cf. [35, 5, 16], [7, Thm. 2.4.8]); we only state it under the spectral norm:
Theorem 2.1**.**
.
Theorem 2.2**.**
[7, Cor. 8.6.2].* For a pair of matrices and it holds that*
[TABLE]
denotes the Moore–Penrose pseudo inverse of .
Hereafter we write to show that exceeds greatly, in context, and let ; otherwise, if , we can extend our study from to , the transpose of , because .
For an matrix and four integers , and such that , consider the following Assumptions:
(a) a fixed algorithm computes a (crude but) reasonably close rank- approximation of superfast (see Secs. 1.3 and 3) and
(b) and that is, the singular values decay strongly as increases from to .
3 Superfast computation of a crude LRA
The following algorithm supports Assumption (a) by using abridged SRHT matrices of Appendix C for sparse multipliers.
Algorithm 3.1**.**
Input:
Three integers , , and satisfying ; an matrix , given explicitly or implicitly.
Initialization:
- 1.
Generate a pair of independent abridged SRHT matrices and of length 3 and sizes and , respectively (see Appendix C); 2. 2.
Compute the sketches and .
Computations:
- 1.
Compute matrix as the Q factor of the thin QR factorization444Recall the uniqueness of the thin QR factorization of an matrix where an upper triangular factor has positive diagonal entries [7, Thm. 5.3.2].* of .* 2. 2.
Compute the matrices and of the thin QR factorization of . 3. 3.
Compute and output the matrix , an LRA of .
4 LRA via a single escalation step
Algorithm 4.1: LRA via a single escalation step.
INPUT: An matrix with , two integers and , and Algorithm , such that above Assumptions (a) and (b) hold.
COMPUTATIONS:
-
Apply Algorithm .
-
Compute and output a rank- truncation of its output matrix .
Assumption (b) implies that
[TABLE]
or, in words, even a crude rank- approximation lies much closer to than its optimal rank- approximation .
Algorithm computes superfast under Assumption (a). If rank is low, then Alg. 3.1 is superfast at stage 2 as well (see Appendix B).
Theorem 4.1**.**
For the error matrix of Alg. 4.1 and of (4.1) it holds that (cf. Thm. 2.1).
Proof.
Notice that and so
[TABLE]
Eqn. (4.1) and Thm. 2.2 together imply that Hence
[TABLE]
Combine estimates (4.2) and (4.3). ∎
Thm. 4.1 implies that for of (4.1). Hence under the spectral norm the rank- approximation of , output by Alg. 4.1, is optimal up to the term , that is, near-optimal under (4.1).
5 Iterative refinement of an LRA with recursive re-compression
5.1 Introductory comments
In our alternative LRA algorithms we rely solely on Assumption (a) and assume no a priori information about the spectrum of the singular values of the matrix , although in view of the Eckart-Young-Mirsky theorem the efficiency of the algorithm should depend on that spectrum.
We propose to apply Algorithm recursively (for a certain policy of recursive selection of integers ) to the error matrices output at the previous recursive steps rather than to . In other words, we escalate a crude LRA towards optimal LRA recursively – by means of extending to LRA the popular technique of iterative refinement.
To get progress in refinement we apply the customary mixed precision recipe, that is, increase (typically double) the computational precision at all subtraction stages. In our superfast iterative refinement of LRA we only compute sketches and/or for abridged SRHT multipliers and/or (or for some other tall, skinny, and sparse matrices and/or ) rather than all entries of error matrices .
To extend the known refinement techniques to LRA, we also apply mixed compression recipe – by incorporating the re-compression techniques of [19, 20, 25, 33, 26, 23] and [22, Secs. 6.3-6.5]). Namely, in refinement iteration we approximate by the sums of all rank- approximations of the error matrices computed so far, beginning with the initial approximation of at the first iteration. In the th iteration the sum can have rank as large as ; its compression to optimal rank- approximation can be costly for large , but we periodically compress the partial sums to their rank- approximations keeping bounded the number of rank- terms in every sum, e.g., just two rank- terms.
Using sketches and compression complicates formal support for iterative refinement LRA algorithms for LRA, but empirically they handle a wider class of inputs than Alg. 4.1.
5.2 Two refinement algorithms
Next we specify two variants of iterative refinement of LRA.
Algorithm 5.1: ITERATIVE REFINEMENT 1.
INPUT: an matrix , an integer such that , and a black box algorithm that for a positive integer and an matrix outputs a crude but reasonably close rank- approximation of .
INITIALIZATION: Write and ; fix integers and for such that and for .
COMPUTATIONS: Do for :
-
,
-
,
-
.
OUTPUT: .
Next we update the error matrix as follows: . This should counter cancellation of significant digits in the subtraction .
Algorithm 5.2: ITERATIVE REFINEMENT 2.
INPUT and INITIALIZATION: as in Alg. 5.1.
COMPUTATIONS: Do for :
-
,
-
.
OUTPUT: .
Remark 5.1**.**
We avoid -fold memory increase in the output summation of Alg. 5.2 by means of recursive truncation. We can just write , for , or, more generally, can recursively combine summation and periodic truncation with a fixed period , , as follows: , for , for (in particular, ), .
Remark 5.2**.**
Algs. 5.1 and 5.2 stop in iterations, where one can choose the integer parameter based on the previous record of convergence of the output matrices to ; our tests of Sec. 6 have consistently succeeded in two iterations. For an alternative heuristic stopping criterion, we can stop iterations where a fixed norm (say, 1-norm or -norm) of the matrices and decreases below a fixed tolerance value for sufficiently many random choices of sparse multipliers and . Optimization of stopping criteria for the refinement algorithms as well as their formal support and cost estimates are research challenges. For some test results of Alg. 5.1 see Sec. 6.
6 Numerical experiments
In this section we present the test results for our LRA algorithms on inputs made up of synthetic and real-world data with various spectra. We implemented our algorithm in Python, running it on a 64bit MacOS machine with 16GB Memory. We called scipy.linalg for numerical linear algebra routines such as QR factorization with pivoting, Moore-Penrose matrix inversion, and SVD.
6.1 Input Matrices
Real-world Input Matrices: We used three input matrices coming from different applications.
The matrix Shaw has target rank and comes from a one-dimensional image restoration model problem.
The matrix Gravity has target rank and comes from a one-dimensional gravity surveying model problem.
Both Shaw and Gravity are dense real matrices having low numerical rank; we padded them with 0s to increase their size to . They represent discretization of Integral Equations, provided in the built-in problems of the Regularization Tools555See http://www.math.sjsu.edu/singular/matrices and http://www.imm.dtu.dk/$\sim$pcha/Regutools For more details see Ch. 4 of http://www.imm.dtu.dk/$\sim$pcha/Regutools/RTv4manual.pdf .
The third input matrix is generated from the discretization of a single layer potential (SLP) operator (see [10, Sec. 7.1] for more details). It has size and the target rank .
We display the distribution of the top 50 singular values of these matrices in Fig. 1.
Synthetic Input Matrices: We used random synthetic input matrices of two kinds – with fast and slowly decaying spectra. In both cases we generated these matrices as the products , where and were the matrices of the left and right singular vectors of a Gaussian random matrix, respectively.
By letting , where for , for , and for , we generated input matrices with fast decaying spectra and target rank .
By letting , where for , and for , we generated input matrices with slowly decaying spectra and target rank .
6.2 Test results for Alg. 4.1
Table 6.1 displays the relative error
[TABLE]
for an input matrix , a target rank , and the -truncation of and of (cf. Alg. 4.1). We tested matrices for and let , , , and , to see the impact of increasing .
We repeated the tests 100 times per test set, that is, per triple of , , and an input matrix type. Table 6.1 shows the mean and standard deviation of the 100 relative errors.
For our input classes, the relative error bound usually stayed near the optimal value 1.000 already where we choose , and the errors consistently decreased towards 1.000 as increased (see Table 6.1). Alg. 4.1, however, failed on the matrix class Shaw for , which could be predicted because the spectrum of singular values of these matrices is flat for (decreasing roughly from to between and ). The test results were much more favorable for and .
6.3 Test results for iterative refinement of LRA
Our Table 6.2 displays the relative error
[TABLE]
for an input matrix , the approximation output in the -th iteration of Alg. 5.1, a target rank , and the -truncation of SVD of .
We tested Alg. 5.1 for iterative refinement of an LRA where we incorporated Alg. 3.1 as Algorithm , applied recursively for superfast computation of crude LRA and using abridged SRHT multipliers and .
For comparison, we have also tested Alg. 5.1 incorporating as Algorithm the slower algorithm of [37] with Gaussian random multipliers of the same sizes.
To control rank growth of the auxiliary LRAs computed by Algorithm , we applied them for and for all .
For each input matrix and each type of multipliers, we run Alg. 5.1 one hundred (100) times with independent random choices of multipliers and . We terminated each run in three refinement iterations, even though the approximations were always quite accurate already in two iterations. We recorded the mean relative error ratio of the approximation before and after the truncation for every iteration step in Table 6.2. With our choice of subalgorithm and multipliers, the rank of approximation before truncation could potentially be twice the target rank in the second and third iterations, and for this reason the relative error before truncation were much less than 1 in our tests.
Running LRA tests with both Gaussian random and abridged SRHT multipliers we observed similar output errors with only minor deterioration as the price for superfast acceleration.
Even in the tests with the matrices Shaw, the output relative error was close to 1, although not as close as in case of the other inputs, possibly because of the adverse impact of the clusters of trailing singular values in the spectra of matrices Shaw.
7 Conclusions
We proposed, tested, and briefly analyzed some simple but novel superfast LRA algorithms that are near-optimal for a large class of matrices.
Our study should motivate new effort for further advance in this direction. In particular, one can extend Alg. 3.1 by recalling some known algorithms that compute accurate LRA (e.g., those of [8, 18, 30, 14]) and accelerating them at the expense of some controlled deterioration of their output accuracy.
Another major research challenge is the extension of formal and experimental support for the known iterative refinement algorithms to new ones for LRA, in particular by combining our present algorithms and that of [14].
One can try to apply Alg. 4.1 to detection and verification of the gaps in the spectrum of the singular values of a matrix and to extend our whole approach to tensor decomposition.
Appendix
Appendix A Small families of matrices that are hard for superfast LRA
Any superfast algorithm fails to compute a close LRA of the following small families of matrices.
Example A.1**.**
Let denote an matrix of rank 1 filled with 0s except for its th entry filled with 1. The such matrices form a family of -matrices. We also include the null matrix filled with 0s into this family. Now fix any superfast algorithm; it does not access the th entry of its input matrices for some pair of and . Therefore it outputs the same approximation of the matrices and , with an undetected error at least 1/2. Arrive at the same conclusion by applying the same argument to the set of small-norm perturbations of the matrices of the above family and to the sums of the latter matrices with any fixed matrix of low rank. Furthermore, the same argument shows that superfast estimation of the norm of a matrix fails already on the same matrix family and that for both problems of LRA and norm estimation randomized superfast algorithms fail with error probability not close to 0.
Remark A.1**.**
How representative are the above matrix families of hard inputs? Clearly, they do not rule out superfast LRA computations if we restrict input matrices to be symmetric or diagonally dominant. Furthermore, the matrices of these families represent data singularities and are not relevant to a large class of input matrices representing regular processes or, say, smooth surfaces.
Appendix B Computation and approximation of -top SVD of an LRA
B.1 Computation of -top SVD
of an LRA
For completeness of our exposition we next recall the classical algorithm that computes -top SVD of an LRA superfast (cf., e.g., [19]).
Algorithm B.1**.**
[Computation of -top SVD of an LRA.]**
Input:
Four integers , , , and such that and two matrices and .
Output:
Three matrices (unitary), (diagonal), and (unitary) such that is a -top SVD of .
Computations:
- 1.
Compute SVDs and where , , and . 2. 2.
Compute matrices , , , and such that is SVD, , and and are the matrices of the top (largest) and the trailing (smallest) singular values of the matrix , respectively. Output the matrix . 3. 3.
Compute and output the matrices and made up of the first columns of the matrices and , respectively.
The algorithm involves flops; it is superfast if . Its correctness follows from equations , , and .
For any matrix triangle inequality implies that
[TABLE]
where by following [10] we use unified notation for both spectral norm and Frobenius norm of a matrix . Furthermore, , and so (cf. [37, Proposition 6.1])
[TABLE]
B.2 Approximation of -top SVD of an LRA
We can readily extend Alg. B.1 to the computation of -top SVD of a 3-factor LRA , but next we quite closely approximate -top SVD of an LRA (and similarly of a 3-factor LRA) based on rank-revealing QRP factorization of the factors and rather than on their SVD.
Algorithm B.2**.**
[Approximation of -top SVD of an LRA.]**
Input:
as in Alg. B.1.
Output:
Three matrices (unitary), (diagonal), and (unitary) such that the matrix approximates a -top SVD of .
Computations:
- 1.
Compute Rank-revealing QRP and P*′LQ′** factorization and where , , and are upper triangular matrices, and and are permutation matrices.* 2. 2.
Define leading principal submatrices of and of . Define matrix by first deleting the last rows of and then deleting the columns of the resulting matrix filled with 0s. Define matrix by first deleting the last columns of and then deleting the rows of the resulting matrix filled with 0s. Compute the nonsingular matrices , , and . 3. 3.
Compute SVD where , , and are matrices, and are unitary, and is a diagonal matrix filled with the singular values of . Output the matrix . 4. 4.
Compute and output the unitary matrices and .
The asymptotic complexity of the algorithm is flops, the same as for Alg. B.1, but the overhead constant in the ”O” notation is a little smaller because at stage 1 we substitute rank revealing QRP factorization for SVD. [6] computes QRP factorization that define LRA within the error bound from optimal for any , e.g., for . Alternatively one can use rank-revealing factorization of [21] instead of factorization – staying within the same arithmetic cost bound but increasing the error bound to .
Appendix C Generation of abridged SRHT matrices
In this section we specify the family of abridged SRHT matrices, used in Alg. 3.1 and defined by means of abridging the classical recursive processes of the generation of SRHT matrices, obtained from the dense matrices of Walsh-Hadamard transform for (cf. [15, Sec. 3.1]). The matrices are obtained in recursive steps, but we only perform steps, and the resulting abridged matrix can be multiplied by a vector by using additions and subtractions. SRHT matrices are obtained from the matrices by means of random sampling and scaling, which we also apply to the -abridged Hadamard transform matrices . They turn into for but are sparse fo . Namely, we write and then specify the following recursive process:
[TABLE]
For any fixed pair of and , each of the matrices is orthogonal up to scaling and has nonzero entries in every row and column; we can compute the product for an submatrix of by using less than additions and subtractions.
Now define the -Abridged Scaled and Permuted Hadamard matrices, , where is a random sampling matrix and is the matrix of random integer diagonal scaling. Each random permutation or scaling contributes up to random parameters. We can involve more random parameters by applying random permutation and scaling also to some or all intermediate matrices for .
The first columns of for form a -Abridged Subsampled Randomized Hadamard Transform (SRHT) matrix , which turns into a SRHT matrix for , where , is a target rank and is the oversampling parameter (cf. [10, Sec. 11]).
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] A. Björk, Numerical Methods in Matrix Computations , Springer, New York, 2015.
- 2[2] A. Bakshi, D. P. Woodruff: Sublinear Time Low-Rank Approximation of Distance Matrices, Procs. 32nd Intern. Conf. Neural Information Processing Systems (NIPS’18), 3786-3796, Montréal, Canada, 2018.
- 3[3] J. Demmel, Y. Hida, W. Kahan, X. S. Li, S. Mukherjee, E. J. Riedy, Error bounds from extra-precise iterative refinement, ACM Trans. Math. Softw. , 32, 2, 325-351, 2006.
- 4[4] P. Drineas, M.W. Mahoney, S. Muthukrishnan, Relative-error CUR Matrix Decompositions, SIAM Journal on Matrix Analysis and Applications , 30, 2 , 844–881, 2008.
- 5[5] C. Eckart, G. Young, The approximation of one matrix by another of lower rank. Psychometrika , 1 , 211-218, 1936. doi:10.1007/BF 02288367
- 6[6] M. Gu, S.C. Eisenstat, An Efficient Algorithm for Computing a Strong Rank Revealing QR Factorization, SIAM J. Sci. Comput. , 17 , 848-869, 1996.
- 7[7] G. H. Golub, C. F. Van Loan, Matrix Computations , The Johns Hopkins University Press, Baltimore, Maryland, 2013 (fourth edition).
- 8[8] S. Goreinov, I. Oseledets, D. Savostyanov, E. Tyrtyshnikov, N. Zamarashkin, How to Find a Good Submatrix, in Matrix Methods: Theory, Algorithms, Applications (dedicated to the Memory of Gene Golub, edited by V. Olshevsky and E. Tyrtyshnikov), pages 247–256, World Scientific Publishing, New Jersey, ISBN-13 978-981-283-601-4, ISBN-10-981-283-601-2, 2010.
