Submodular Ground-Set Pruning: Monotone Tightness and a Non-Monotone Separation
Alan Kuhnle

TL;DR
This paper investigates containment pruning for large-scale submodular maximization, establishing tight bounds for monotone cases and novel algorithms for non-monotone objectives, with empirical benefits demonstrated on MaxCut and LLM tasks.
Contribution
It provides the first theoretical bounds for containment pruning, shows that for non-monotone objectives containment pruning is easier than optimization itself, and introduces practical algorithms with empirical validation.
Findings
Greedy achieves the tight $1-1/e$ containment factor for monotone submodular functions.
Non-monotone objectives admit $1/2-\epsilon$ containment algorithms, surpassing previous ratios.
Pruning significantly accelerates MaxCut solving and benefits LLM context selection.
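To make the monotone finding concrete, here is a minimal sketch of greedy used as a pruning step. It is not the paper's implementation: the coverage objective, the toy instance, and all names (`coverage`, `greedy_core`, `best_value`, `containment_ratios`, `TOY_SETS`) are assumptions chosen for illustration. It relies on the standard fact that the first few elements picked by greedy are themselves a greedy solution for that smaller budget, so a greedy core built for the largest budget contains a $1-1/e$ approximate solution for every smaller budget as well.

```python
import math
from itertools import combinations

# Standard greedy guarantee for monotone submodular maximization under a cardinality constraint.
BOUND = 1 - 1 / math.e

# Hypothetical toy instance: each ground-set element is a set of covered items.
TOY_SETS = [{1, 2, 3, 4}, {3, 4, 5, 6}, {1, 2, 7}, {5, 6, 8}, {9}, {7, 8}]


def coverage(sets, chosen):
    """Monotone submodular objective: number of items covered by the chosen indices."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def greedy_core(sets, k):
    """Pick up to k indices by largest marginal gain; the result is the pruned core."""
    core = []
    for _ in range(min(k, len(sets))):
        base = coverage(sets, core)
        best = max(
            (i for i in range(len(sets)) if i not in core),
            key=lambda i: coverage(sets, core + [i]) - base,
        )
        core.append(best)
    return core


def best_value(sets, candidates, budget):
    """Brute-force optimum over subsets of `candidates` of size at most `budget`."""
    return max(
        coverage(sets, subset)
        for r in range(budget + 1)
        for subset in combinations(candidates, r)
    )


def containment_ratios(sets, k):
    """For each budget 1..k, compare the best solution inside the core to the true optimum."""
    core = greedy_core(sets, k)
    everything = range(len(sets))
    return [
        best_value(sets, core, b) / best_value(sets, everything, b)
        for b in range(1, k + 1)
    ]


if __name__ == "__main__":
    for budget, ratio in enumerate(containment_ratios(TOY_SETS, 3), start=1):
        print(f"budget {budget}: containment ratio {ratio:.3f} (bound {BOUND:.3f})")
```

The brute-force check is only feasible on tiny instances; it is there to make the containment ratio visible, not to suggest how one would evaluate pruning at scale.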
Abstract
Large-scale subset selection asks for a small useful set of examples, features, sensors, seed users, or context passages from an enormous ground set. Submodular maximization is a canonical model for such diminishing-returns problems, but rapidly growing datasets make even linear-time algorithms ever costlier. We study containment pruning: first reduce the ground set to a smaller core, then require that the core contain a near-optimal feasible solution for every downstream budget up to a prescribed maximum. Prior work has formulated many heuristics, but the theoretical limits of this preprocessing problem are largely unknown. For monotone submodular objectives, we prove that $1-1/e$ is tight: greedy achieves this containment factor, and no algorithm can beat it even with a larger pruning budget. For non-monotone objectives, we give the first containment algorithms under cardinality…
