NNK-Means: Data summarization using dictionary learning with non-negative kernel regression
Sarath Shekkizhar, Antonio Ortega

TL;DR
This paper introduces NNK-Means, a scalable data summarization method using dictionary learning with non-negative kernel regression, which produces representative atoms and improves class separation over traditional methods.
Contribution
The paper presents NNK-Means, a novel dictionary learning approach leveraging NNK graphs that enhances data summarization and class separation while maintaining scalability.
Findings
NNK-Means outperforms kMeans and kSVD in class separation.
NNK-Means has runtime complexity similar to kMeans.
The method effectively summarizes large datasets.
Abstract
An increasing number of systems are being designed by gathering significant amounts of data and then optimizing the system parameters directly using the obtained data. Often this is done without analyzing the dataset structure. As task complexity, data size, and parameters all increase to millions or even billions, data summarization is becoming a major challenge. In this work, we investigate data summarization via dictionary learning (DL), leveraging the properties of recently introduced non-negative kernel regression (NNK) graphs. Our proposed NNK-Means, unlike previous DL techniques, such as kSVD, learns geometric dictionaries with atoms that are representative of the input data space. Experiments show that summarization using NNK-Means can provide better class separation compared to linear and kernel versions of kMeans and kSVD. Moreover, NNK-Means is scalable, with runtime…
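To make the idea of non-negative kernel regression more concrete, here is a minimal sketch of how a single data point might be coded against a fixed set of dictionary atoms. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the choice of k nearest atoms as candidates, the projected-gradient solver, and the names `gaussian_kernel` and `nnk_code` are all hypothetical choices made for this example. The dictionary update step of NNK-Means is not shown.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def nnk_code(x, atoms, k=3, sigma=1.0, n_iter=500):
    """Code x with non-negative weights over its k most similar atoms.

    Minimizes 0.5 * theta^T K_ss theta - k_sx^T theta subject to theta >= 0,
    which is kernel-space least squares with a non-negativity constraint.
    Solver (assumption): plain projected gradient descent.
    """
    sims = gaussian_kernel(x[None, :], atoms, sigma)[0]
    idx = np.argsort(-sims)[:k]                      # k most similar atoms
    K_ss = gaussian_kernel(atoms[idx], atoms[idx], sigma)
    k_sx = sims[idx]
    step = 1.0 / np.linalg.eigvalsh(K_ss)[-1]        # 1 / largest eigenvalue
    theta = np.zeros(len(idx))
    for _ in range(n_iter):
        grad = K_ss @ theta - k_sx
        theta = np.maximum(theta - step * grad, 0.0)  # project onto theta >= 0
    return idx, theta

if __name__ == "__main__":
    # Toy dictionary (hypothetical data): three nearby atoms and one far away.
    atoms = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
    x = np.array([0.2, 0.1])
    idx, theta = nnk_code(x, atoms, k=3)
    print("selected atoms:", idx)
    print("non-negative weights:", theta)
```

Because the weights are constrained to be non-negative, each point is represented only by atoms that lie near it, which is one way to read the abstract's claim that the learned atoms are representative of the input data space.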
Taxonomy
Topics: Text and Document Classification Technologies · Face and Expression Recognition · Machine Learning and ELM
