GIST: Greedy Independent Set Thresholding for Max-Min Diversification with Submodular Utility
Matthew Fahrbach, Srikumar Ramalingam, Morteza Zadimoghaddam, Sara Ahmadian, Gui Citovsky, Giulia DeSalvo

TL;DR
This paper introduces GIST, a greedy algorithm for max-min diversification with submodular utility. It comes with theoretical approximation guarantees and outperforms existing methods on a data sampling task on ImageNet.
Contribution
The paper proposes GIST, a novel greedy algorithm with approximation guarantees for max-min diversification with submodular utility, and shows empirically that it outperforms existing methods.
Findings
GIST achieves a 1/2-approximation guarantee for MDMS.
It is NP-hard to approximate MDMS within a factor of 0.5584.
GIST outperforms state-of-the-art benchmarks on ImageNet data sampling.
Abstract
This work studies a novel subset selection problem called max-min diversification with monotone submodular utility (MDMS), which has a wide range of applications in machine learning, e.g., data sampling and feature selection. Given a set of points in a metric space, the goal of MDMS is to maximize f(S) = g(S) + λ · div(S) subject to a cardinality constraint |S| ≤ k, where g(S) is a monotone submodular function and div(S) = min_{u,v ∈ S : u ≠ v} dist(u, v) is the max-min diversity objective. We propose the GIST algorithm, which gives a 1/2-approximation guarantee for MDMS by approximating a series of maximum independent set problems with a bicriteria greedy algorithm. We also prove that it is NP-hard to approximate MDMS within a factor of 0.5584. Finally, we show in our empirical study…
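The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's exact procedure. It assumes the objective has the form g(S) + λ · div(S), sweeps every distinct pairwise distance as a threshold (a simplification chosen here for clarity), and treats div(S) as the diameter of the point set when S has fewer than two points (an assumed convention). The names `diversity`, `greedy_independent_set`, and `gist_sketch`, and the toy data, are hypothetical.

```python
import itertools
import math


def diversity(points, subset):
    """Minimum pairwise distance within `subset`.

    Assumed convention: for fewer than two points, return the diameter
    of the whole point set so that small sets are not penalized.
    """
    if len(subset) >= 2:
        pairs = itertools.combinations(subset, 2)
        return min(math.dist(points[u], points[v]) for u, v in pairs)
    all_pairs = itertools.combinations(range(len(points)), 2)
    return max((math.dist(points[u], points[v]) for u, v in all_pairs), default=0.0)


def greedy_independent_set(points, g, k, d):
    """Greedily pick up to k points that are pairwise at least d apart,
    always taking the remaining candidate with the largest marginal gain in g."""
    selected = []
    candidates = list(range(len(points)))
    while candidates and len(selected) < k:
        base = g(selected)
        best = max(candidates, key=lambda v: g(selected + [v]) - base)
        selected.append(best)
        # Drop the chosen point and everything closer than d to it.
        candidates = [
            v for v in candidates
            if v != best and math.dist(points[v], points[best]) >= d
        ]
    return selected


def gist_sketch(points, g, k, lam):
    """Try each candidate distance threshold and keep the best set found.

    Simplification: thresholds are 0 plus all distinct pairwise distances.
    """
    thresholds = {0.0}
    for u, v in itertools.combinations(range(len(points)), 2):
        thresholds.add(math.dist(points[u], points[v]))
    best_set, best_value = [], float("-inf")
    for d in sorted(thresholds):
        subset = greedy_independent_set(points, g, k, d)
        value = g(subset) + lam * diversity(points, subset)
        if value > best_value:
            best_set, best_value = subset, value
    return best_set, best_value


# Toy instance: three points on a line with an additive (modular) utility,
# which is a special case of a monotone submodular function.
points = [(0.0,), (1.0,), (10.0,)]
weights = [3.0, 2.0, 1.0]


def g(subset):
    return sum(weights[i] for i in subset)


best_set, best_value = gist_sketch(points, g, k=2, lam=1.0)
```

On this toy instance the plain greedy choice (threshold 0) picks the two heaviest points, which sit next to each other, while a larger threshold forces the second pick to be far from the first and yields a higher combined objective. Sweeping all pairwise distances costs a quadratic number of greedy runs, so a practical implementation would use a coarser set of thresholds.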
Taxonomy
Topics: Time Series Analysis and Forecasting · Big Data Technologies and Applications · Data Management and Algorithms
Methods: Sparse Evolutionary Training
