Coresets for Estimating Means and Mean Square Error with Limited Greedy Samples
Saeed Vahidian, Baharan Mirzasoleiman, Alexander Cloninger

TL;DR
This paper introduces a scalable greedy algorithm for coreset selection that efficiently estimates means and mean square errors in graph-structured data, outperforming existing methods in accuracy and speed.
Contribution
The paper presents a novel gradient ascent-based coreset selection algorithm that handles variable node costs and provides theoretical error bounds, with extensive empirical validation.
Findings
Faster empirical convergence than random and clustering methods
Effective in semi-supervised node classification and sensor placement
Outperforms current state-of-the-art algorithms
Abstract
In many settings, collecting a function value for every data point is prohibitively expensive, and random sampling ignores any structure in the underlying data. We introduce a scalable optimization algorithm for coreset selection in graphs: a variant of gradient ascent that, in contrast to Frank-Wolfe and its variants, needs no correction steps and greedily selects a weighted subset of vertices deemed most important to sample. Our algorithm estimates the mean of the function by taking a weighted sum only at these vertices, and we provably bound the estimation error in terms of the location and weights of the selected vertices in the graph. In addition, we consider the case where nodes have different selection costs and provide bounds on the quality of the low-cost selected coresets. We demonstrate the benefits of our algorithm on the semi-supervised node classification…
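The idea of estimating a mean from a weighted sum over a few greedily chosen vertices can be illustrated with a small sketch. The code below is not the authors' algorithm: it substitutes a kernel-herding-style greedy rule on a heat kernel of the graph Laplacian, ignores node costs, and all function names (`path_graph`, `heat_kernel`, `greedy_coreset`, `coreset_mean`) and the diffusion time `t` are assumptions made for illustration.

```python
import numpy as np

def path_graph(n):
    """Adjacency matrix of a path graph on n vertices (toy example graph)."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj

def heat_kernel(adj, t=5.0):
    """Heat kernel exp(-t L) of the combinatorial Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    evals, evecs = np.linalg.eigh(lap)
    return (evecs * np.exp(-t * evals)) @ evecs.T

def greedy_coreset(K, budget):
    """Herding-style greedy pick (a stand-in, not the paper's method)."""
    n = K.shape[0]
    target = K.mean(axis=1)            # kernel mean of the uniform measure
    picked = np.zeros(n, dtype=bool)
    running = np.zeros(n)              # sum of kernel columns chosen so far
    selected = []
    for step in range(budget):
        score = target - running / (step + 1)
        score[picked] = -np.inf        # never pick the same vertex twice
        j = int(np.argmax(score))
        picked[j] = True
        selected.append(j)
        running += K[:, j]
    return selected

def coreset_mean(K, selected, f):
    """Weighted-sum estimate of mean(f) using only f at the selected vertices."""
    target = K.mean(axis=1)
    K_ss = K[np.ix_(selected, selected)]
    w = np.linalg.solve(K_ss, target[selected])  # weights matching the kernel mean
    w = w / w.sum()                    # normalize so constants are reproduced
    return float(w @ f[selected]), w

n = 30
K = heat_kernel(path_graph(n))
S = greedy_coreset(K, budget=5)
f = np.sin(np.linspace(0.0, np.pi, n))  # a smooth function on the path
estimate, weights = coreset_mean(K, S, f)
print("selected vertices:", S)
print("coreset estimate:", estimate, "true mean:", f.mean())
```

Normalizing the weights to sum to one is a design choice made here so that constant functions are estimated exactly; the paper's own weighting and its error bounds may differ.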
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Gaussian Processes and Bayesian Inference · Neural Networks and Applications
