Range-Net: A High Precision Streaming SVD for Big Data Applications
Gurpreet Singh, Soumyajit Gupta, Matthew Lease, Clint Dawson

TL;DR
Range-Net is a deterministic neural approach to streaming SVD whose factors satisfy the Eckart-Young-Mirsky (EYM) tail-energy lower bound. It is reported to be far more accurate than existing randomized methods, and its memory footprint depends only on feature dimension and rank, which suits big data applications.
Contribution
The paper presents Range-Net, a novel neural optimization method for streaming SVD that guarantees EYM tail-energy bounds and improves accuracy over randomized SVD schemes.
Findings
Range-Net achieves six orders of magnitude better accuracy than state-of-the-art streaming randomized SVD.
Range-Net's memory requirement depends only on feature dimension and rank, not sample size.
Theoretical guarantees ensure Range-Net's SVD factors satisfy EYM tail-energy lower bounds.
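To make the tail-energy lower bound concrete, here is a minimal NumPy sketch. It is not Range-Net itself; the helper names `tail_energy` and `projection_error` and the random test matrix are illustrative assumptions. It checks that the top-r right singular vectors attain the EYM bound, while an arbitrary orthonormal basis of the same rank cannot fall below it.

```python
import numpy as np

def tail_energy(X, r):
    """EYM lower bound: sum of squared singular values beyond the first r."""
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s[r:] ** 2))

def projection_error(X, V):
    """Squared Frobenius error after projecting rows of X onto span(V).

    V is assumed to have orthonormal columns, so X @ V @ V.T has rank <= V.shape[1].
    """
    return float(np.linalg.norm(X - X @ V @ V.T, "fro") ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 30))   # hypothetical data: 200 samples, 30 features
r = 5

# Optimal rank-r subspace: the top-r right singular vectors.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V_opt = Vt[:r].T

# An arbitrary rank-r orthonormal basis, for comparison.
Q, _ = np.linalg.qr(rng.standard_normal((30, r)))

bound = tail_energy(X, r)
err_opt = projection_error(X, V_opt)
err_arbitrary = projection_error(X, Q)
```

Here `err_opt` matches `bound` up to floating-point error, and `err_arbitrary` is never smaller than `bound`; the gap between an approximation's error and this bound is one way to read the accuracy comparison described above.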
Abstract
In a Big Data setting, computing the dominant SVD factors is restrictive due to main-memory requirements. Recently introduced streaming randomized SVD schemes work under the restrictive assumption that the singular value spectrum of the data decays exponentially, which is seldom true for practical data. Although these methods are claimed to be applicable to scientific computations because of their associated tail-energy error bounds, the approximation errors in the singular vectors and values are high when that assumption does not hold. Furthermore, from a practical perspective, oversampling can still be memory intensive or, worse, can exceed the feature dimension of the data. To address these issues, we present Range-Net as an alternative to randomized SVD that satisfies the tail-energy lower bound given by the Eckart-Young-Mirsky (EYM) theorem. Range-Net is a deterministic…
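For context on the randomized baseline the abstract criticizes, the following is a minimal sketch of a basic (non-streaming) randomized SVD with oversampling, in the style of a Gaussian range finder. It is not the streaming variant the paper compares against and not Range-Net; the function name `randomized_svd`, the oversampling parameter `p`, and the synthetic spectrum sigma_i = 1/sqrt(i) are assumptions chosen for illustration.

```python
import numpy as np

def randomized_svd(X, r, p, rng):
    """Rank-r SVD estimate from a Gaussian sketch with p oversampled columns."""
    n, d = X.shape
    omega = rng.standard_normal((d, r + p))   # random test matrix; needs r + p <= d
    Q, _ = np.linalg.qr(X @ omega)            # orthonormal basis of the sketched range
    B = Q.T @ X                               # small (r + p) x d projected matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :r], s[:r], Vt[:r]     # truncate back to rank r

rng = np.random.default_rng(0)
n, d, r, p = 300, 40, 5, 5

# Synthetic data with a slowly (non-exponentially) decaying spectrum.
U0, _ = np.linalg.qr(rng.standard_normal((n, d)))
V0, _ = np.linalg.qr(rng.standard_normal((d, d)))
sigma = 1.0 / np.sqrt(np.arange(1, d + 1))
X = (U0 * sigma) @ V0.T

U, s, Vt = randomized_svd(X, r, p, rng)
approx_error = np.linalg.norm(X - (U * s) @ Vt, "fro") ** 2
eym_bound = np.sum(sigma[r:] ** 2)            # best possible rank-r tail energy
```

Comparing `approx_error` with `eym_bound`, and `s` with `sigma[:r]`, shows how far the sketch-based factors sit from the optimum on a slowly decaying spectrum. Increasing `p` narrows the gap at the cost of memory, and `r + p` cannot usefully exceed the feature dimension `d`.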
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data
