TL;DR
UniPROT introduces a theoretically grounded, scalable method for selecting uniformly weighted prototypes using partial optimal transport, improving minority class representation in imbalanced classification tasks.
Contribution
It reformulates optimal transport constraints into a submodular objective, enabling efficient greedy algorithms with guarantees, and demonstrates improved minority class performance.
Findings
Enforcing uniform prototype weights improves minority-class representation.
Achieves robust performance gains in language models under domain imbalance.
Provides a scalable, theoretically justified prototype selection method.
Abstract
Selecting prototypical examples from a source distribution to represent a target data distribution is a fundamental problem in machine learning. Existing subset selection methods often rely on implicit importance scores, which can be skewed towards majority classes and lead to low-quality prototypes for minority classes. We present UniPROT, a novel subset selection framework that minimizes the optimal transport (OT) distance between a uniformly weighted prototypical distribution and the target distribution. While intuitive, this formulation leads to a cardinality-constrained maximization of a super-additive objective, which is generally intractable to approximate efficiently. To address this, we propose a principled reformulation of the OT marginal constraints, yielding a partial optimal transport-based submodular objective. We prove that this reformulation enables a greedy…
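The abstract is cut off above, but it refers to greedy maximization of a submodular objective under a cardinality constraint. The following is a minimal sketch of that general pattern only. It uses the classic facility-location function as a stand-in objective and is not UniPROT's partial-OT objective, which this page does not spell out. The names `facility_location`, `greedy_select`, and the toy similarity matrix `sim` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def facility_location(sim: Sequence[Sequence[float]], selected: List[int]) -> float:
    """Stand-in submodular objective (NOT the paper's partial-OT objective).

    sim[i][j] is a non-negative similarity between candidate i and target j.
    The value of a set is the sum, over targets, of the best similarity
    offered by any selected candidate.
    """
    if not selected:
        return 0.0
    n_targets = len(sim[0])
    return sum(max(sim[i][j] for i in selected) for j in range(n_targets))


def greedy_select(sim: Sequence[Sequence[float]], k: int) -> List[int]:
    """Pick up to k candidates, each time adding the one with the largest marginal gain."""
    selected: List[int] = []
    remaining = set(range(len(sim)))
    current = 0.0
    for _ in range(min(k, len(sim))):
        best_gain, best_i = -1.0, -1
        for i in sorted(remaining):  # sorted for a deterministic tie-break
            gain = facility_location(sim, selected + [i]) - current
            if gain > best_gain:
                best_gain, best_i = gain, i
        selected.append(best_i)
        remaining.remove(best_i)
        current += best_gain
    return selected


# Toy data (assumed): rows are candidate prototypes, columns are target points.
sim = [
    [0.9, 0.8, 0.1, 0.0],  # candidate 0: close to targets 0 and 1
    [0.8, 0.7, 0.0, 0.1],  # candidate 1: largely redundant with candidate 0
    [0.0, 0.1, 0.9, 0.7],  # candidate 2: close to targets 2 and 3
]
chosen = greedy_select(sim, k=2)
print(chosen)  # [0, 2]
```

In this toy case the second pick is candidate 2 rather than the redundant candidate 1, because its marginal gain is larger once candidate 0 is already selected. For monotone submodular objectives such as this one, the greedy rule carries the well-known (1 - 1/e) approximation guarantee; whether and how that guarantee carries over to the paper's objective is stated in the paper itself, not here.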
