TL;DR
This paper introduces a theoretically grounded, efficient algorithm for selecting prototypes and outliers with importance weights across diverse domains, enhancing data interpretability and insight extraction.
Contribution
It generalizes the prototype selection method of Kim et al. (2016) by attaching non-negative importance weights to the selected prototypes. The framework applies to any symmetric positive definite kernel and comes with proven approximation guarantees.
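To make the idea of weighted prototype selection concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm as published: the objective (an MMD-style score, w·mu - 0.5·wᵀKw with w ≥ 0), the greedy selection rule, and the projected-gradient weight solver are all assumptions chosen for illustration, and the names `gaussian_kernel`, `fit_weights`, and `select_prototypes` are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel, one example of a symmetric positive definite kernel."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def fit_weights(K_ss, mu_s, steps=500):
    """Approximately maximise w.mu_s - 0.5 * w^T K_ss w subject to w >= 0.

    Uses projected gradient ascent (an assumed stand-in solver).
    """
    m = len(mu_s)
    w = [0.0] * m
    # The trace of a PSD matrix bounds its largest eigenvalue, so this step is safe.
    lr = 1.0 / sum(K_ss[i][i] for i in range(m))
    for _ in range(steps):
        grad = [mu_s[i] - sum(K_ss[i][j] * w[j] for j in range(m)) for i in range(m)]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[i] + lr * grad[i]) for i in range(m)]
    return w


def select_prototypes(X, m, kernel=gaussian_kernel):
    """Greedily pick up to m prototype indices from X and fit non-negative weights."""
    n = len(X)
    K = [[kernel(a, b) for b in X] for a in X]
    # mu[j]: mean kernel similarity between point j and the whole data set.
    mu = [sum(row) / n for row in K]
    chosen, w = [], []
    for _ in range(m):
        best, best_gain = None, float("-inf")
        for j in range(n):
            if j in chosen:
                continue
            # Partial derivative of the objective w.r.t. candidate j's weight at 0.
            gain = mu[j] - sum(K[j][i] * wi for i, wi in zip(chosen, w))
            if gain > best_gain:
                best, best_gain = j, gain
        if best_gain <= 0:
            break  # no remaining candidate improves the objective
        chosen.append(best)
        K_ss = [[K[a][b] for b in chosen] for a in chosen]
        w = fit_weights(K_ss, [mu[a] for a in chosen])
    return chosen, w


if __name__ == "__main__":
    # Toy data: a cluster of three points near 0 and a cluster of two near 5.
    data = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,)]
    idx, weights = select_prototypes(data, 2)
    print(idx, weights)
```

On the toy data in the `__main__` block, the sketch picks one prototype per cluster, and the prototype for the larger cluster receives the larger weight, which is the sense in which the weights indicate importance. Any other symmetric positive definite kernel can be passed in place of `gaussian_kernel`.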
Findings
Prototype and outlier selection is demonstrated on retail, MNIST, and CDC health data.
Quantitative and qualitative validation supports the claim that the selected examples improve interpretability and insight.
The algorithm is fast, theoretically supported, and broadly applicable.
Abstract
Prototypical examples that best summarize and compactly represent an underlying complex data distribution communicate meaningful insights to humans in domains where simple explanations are hard to extract. In this paper we present algorithms with strong theoretical guarantees to mine these data sets and select prototypes, a.k.a. representatives, that optimally describe them. Our work notably generalizes the recent work by Kim et al. (2016): in addition to selecting prototypes, we also associate with them non-negative weights that indicate their importance. This extension provides a single coherent framework under which both prototypes and criticisms (i.e. outliers) can be found. Furthermore, our framework works for any symmetric positive definite kernel, thus addressing one of the key open questions laid out in Kim et al. (2016). By establishing that our objective function enjoys a…
