Learning the Positions in CountSketch
Simin Liu, Tianrui Liu, Ali Vakilian, Yulin Wan, David P. Woodruff

TL;DR
This paper introduces a novel learning algorithm for CountSketch that optimizes both the positions and values of non-zero entries, leading to improved accuracy in low rank approximation and other applications.
Contribution
This work is the first to learn the positions of the non-zero entries in CountSketch, rather than only their values, improving on previous methods that kept the positions fixed.
Findings
Better accuracy in low rank approximation.
Improved results in k-means clustering.
Provably superior in the spiked covariance model and on Zipfian matrices.
Abstract
We consider sketching algorithms which first quickly compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem, e.g., low rank approximation. In the learning-based sketching paradigm proposed by Indyk et al. [2019], the sketch matrix is found by choosing a random sparse matrix, e.g., the CountSketch, and then updating the values of the non-zero entries by running gradient descent on a training data set. Despite the growing body of work on this paradigm, a noticeable omission is that the locations of the non-zero entries of previous algorithms were fixed, and only their values were learned. In this work we propose the first learning algorithm that also optimizes the locations of the non-zero entries. We show this algorithm gives better accuracy for low rank approximation than previous work, and apply it to other…
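To make the setup in the abstract concrete, here is a minimal NumPy sketch of a CountSketch matrix stored as two arrays: one for the positions of the non-zero entries and one for their values. It is an illustration under stated assumptions, not the authors' implementation. The function names (`make_countsketch`, `apply_sketch`, `sketched_low_rank`) are hypothetical, the low rank step uses the standard "project onto the row space of SA, then truncate" approach, and no learning step (gradient descent on values, or position optimization) is included.

```python
import numpy as np

def make_countsketch(m, n, rng):
    """Random m x n CountSketch: one non-zero (+1 or -1) per column.

    Returned as (positions, values). In the learned-sketch setting described
    above, earlier methods update only `values`; the paper also optimizes
    `positions`. Neither update is implemented here.
    """
    positions = rng.integers(0, m, size=n)      # row of the non-zero in each column
    values = rng.choice([-1.0, 1.0], size=n)    # value of that non-zero
    return positions, values

def to_dense(positions, values, m):
    """Materialize the sketch matrix S (only for checking small examples)."""
    n = positions.shape[0]
    S = np.zeros((m, n))
    S[positions, np.arange(n)] = values
    return S

def apply_sketch(positions, values, m, A):
    """Compute S @ A in time proportional to the size of A, without building S."""
    SA = np.zeros((m, A.shape[1]))
    # Row i of A is scaled by values[i] and added into row positions[i] of SA.
    np.add.at(SA, positions, values[:, None] * A)
    return SA

def sketched_low_rank(A, SA, k):
    """Rank-k approximation of A restricted to the row space of SA."""
    _, _, Vt = np.linalg.svd(SA, full_matrices=False)   # basis for rowspace(SA)
    AV = A @ Vt.T                                       # project rows of A onto it
    U, s, Wt = np.linalg.svd(AV, full_matrices=False)
    AV_k = (U[:, :k] * s[:k]) @ Wt[:k]                  # best rank-k within the subspace
    return AV_k @ Vt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 40))
    pos, val = make_countsketch(20, A.shape[0], rng)
    A_k = sketched_low_rank(A, apply_sketch(pos, val, 20, A), k=5)
    print("sketched rank-5 error:", np.linalg.norm(A - A_k))
```

Keeping positions and values in separate arrays makes the distinction in the abstract explicit: a values-only learner would treat `positions` as a constant, while a position-learning method would have to search over that discrete array as well.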
Taxonomy
Topics: Data Mining Algorithms and Applications
