Zero-Shot Coreset Selection via Iterative Subspace Sampling
Brent A. Griffin, Jacob Marks, Jason J. Corso

TL;DR
This paper introduces ZCore, a zero-shot coreset selection method that leverages foundation models to select representative unlabeled data for training, achieving state-of-the-art results without using labels or training on candidate data.
Contribution
ZCore is a novel zero-shot coreset selection approach that uses previously trained foundation models to interpret unlabeled data and select representative subsets, without using labels or training on the candidate data.
Findings
Outperforms state-of-the-art label-based methods at low data rates
Achieves 53.99% accuracy on ImageNet with only 10% training data
Reduces annotation and training costs for large datasets
Abstract
Deep learning increasingly relies on massive data with substantial storage, annotation, and training costs. To reduce costs, coreset selection finds a representative subset of data to train models while ideally performing on par with the full data training. To maximize performance, current state-of-the-art coreset methods select data using dataset-specific ground truth labels and training. However, these methodological requirements prevent selection at scale on real-world, unlabeled data. To that end, this paper addresses the selection of coresets that achieve state-of-the-art performance but without using any labels or training on candidate data. Instead, our solution, Zero-Shot Coreset Selection via Iterative Subspace Sampling (ZCore), uses previously-trained foundation models to generate zero-shot, high-dimensional embedding spaces to interpret unlabeled data. ZCore then iteratively…
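The abstract is cut off before it describes ZCore's selection step, so the sketch below does not attempt to reproduce it. As a rough illustration of the general setting only (label-free selection over precomputed embeddings), it uses greedy k-center selection, a different and well-known technique swapped in here as a stand-in. The function name `k_center_greedy`, the random stand-in embeddings, and the 10% budget are all assumptions made for the example.

```python
import numpy as np


def k_center_greedy(embeddings: np.ndarray, budget: int, seed: int = 0) -> np.ndarray:
    """Greedy k-center selection (NOT ZCore): pick `budget` rows so that
    every point lies close to at least one selected point.

    Uses only the embeddings; no labels and no training are involved.
    """
    n = embeddings.shape[0]
    if not 0 < budget <= n:
        raise ValueError("budget must be between 1 and the number of points")

    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(n))]  # arbitrary starting point

    # Distance from every point to its nearest selected point so far.
    min_dist = np.linalg.norm(embeddings - embeddings[selected[0]], axis=1)

    while len(selected) < budget:
        # The point farthest from the current selection is the least covered.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)

    return np.array(selected)


# Hypothetical stand-in for foundation-model embeddings of unlabeled images.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 64))

# Keep 10% of the candidates, mirroring the 10% data rate quoted in Findings.
coreset_idx = k_center_greedy(embeddings, budget=100)
```

In practice the random array would be replaced by embeddings from a pretrained model, and only the selected indices would then need to be annotated and used for training.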
Taxonomy
Topics: Machine Learning and Data Classification · Face and Expression Recognition
Methods: Coresets
