TL;DR
This paper clarifies misconceptions about low-rank approximation of function-generated matrices. It shows that, for certain classes of generating functions, these matrices admit accurate entrywise approximation with a rank that is independent of the data dimension, with implications for big data and neural networks.
Contribution
The paper provides a theoretical explanation for when function-generated matrices can be approximated with low rank independent of the ambient dimension, extending to tensor-train formats.
Findings
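Accurate entrywise low-rank approximation is possible for three narrower function classes, including functions of the inner product and functions of the Euclidean distance between the two variables.
The required rank grows logarithmically with the matrix size and is independent of the dimension of the variables.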
Results apply to big data matrices and neural network attention mechanisms.
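As a rough numerical illustration of the findings above, the sketch below builds a matrix by sampling a function of the inner product of two high-dimensional variables and measures the largest entrywise error of a rank-r approximation. Everything here is an assumption made for illustration and is not taken from the paper: the choice of the exponential as the generating function, points drawn on the unit sphere, the sizes `n` and `m`, and the helper names. The truncated SVD is swapped in for convenience. It is optimal in the Frobenius norm, not in the entrywise norm, so its error is only an upper bound on what an entrywise-optimal approximation of the same rank could achieve.

```python
import numpy as np


def unit_rows(rng, n, m):
    """Sample n points uniformly on the unit sphere in R^m (one point per row)."""
    pts = rng.standard_normal((n, m))
    return pts / np.linalg.norm(pts, axis=1, keepdims=True)


def inner_product_matrix(X, Y, f=np.exp):
    """Function-generated matrix with entries A[i, j] = f(<x_i, y_j>)."""
    return f(X @ Y.T)


def truncated_svd(A, rank):
    """Rank-`rank` approximation that is optimal in the Frobenius norm.

    Note: this is NOT optimal in the entrywise (max) norm; it is used here
    only as a convenient, readily available baseline.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


def max_entry_error(A, B):
    """Largest absolute difference between corresponding entries."""
    return float(np.max(np.abs(A - B)))


# Illustrative sizes (assumptions): n sample points per variable, dimension m.
rng = np.random.default_rng(0)
n, m = 200, 500
X, Y = unit_rows(rng, n, m), unit_rows(rng, n, m)
A = inner_product_matrix(X, Y)

errors = {r: max_entry_error(A, truncated_svd(A, r)) for r in (1, 5, 20, 80)}
for r, err in errors.items():
    print(f"rank {r:3d}: max entrywise error = {err:.3e}")
```

Re-running the script with different values of `m` while keeping `n` fixed gives a quick, informal way to see how the entrywise error at a given rank responds to the ambient dimension. Passing `f=np.tanh` or another smooth function to `inner_product_matrix` tries a different generating function.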
Abstract
The article concerns low-rank approximation of matrices generated by sampling a smooth function of two $m$-dimensional variables. We identify several misconceptions surrounding a claim that, for a specific class of analytic functions, such matrices admit accurate entrywise approximation of rank that is independent of $m$ and grows as $\log(n)$ with the matrix size $n$, a claim colloquially known as "big-data matrices are approximately low-rank". We provide a theoretical explanation of the numerical results presented in support of this claim, describing three narrower classes of functions for which function-generated matrices can be approximated within an entrywise error of order $\varepsilon$ with rank that is independent of the dimension $m$: (i) functions of the inner product of the two variables, (ii) functions of the Euclidean distance…
Taxonomy
Methods: Softmax · Attention Is All You Need
