Recovering Imbalanced Clusters via Gradient-Based Projection Pursuit
Martin Eppert, Satyaki Mukherjee, Debarghya Ghoshdastidar

TL;DR
This paper introduces a gradient-based projection pursuit method for recovering projections that contain imbalanced clusters or a Bernoulli-Rademacher distribution, analyzes its sample complexity, and reports superior performance on real-world datasets with limited samples.
Contribution
It presents a novel gradient-based approach for recovering imbalanced clusters, with theoretical analysis of sample complexity and applicability to real-world data.
Findings
In the planted vector setting, imbalanced clusters are easier to recover than balanced ones.
The method outperforms existing techniques when samples are limited.
Theoretical analysis aligns with empirical results.
Abstract
Projection Pursuit is a classic exploratory technique for finding interesting projections of a dataset. We propose a method for recovering projections containing either Imbalanced Clusters or a Bernoulli-Rademacher distribution using a gradient-based technique to optimize the projection index. As sample complexity is a major limiting factor in Projection Pursuit, we analyze our algorithm's sample complexity within a Planted Vector setting where we can observe that Imbalanced Clusters can be recovered more easily than balanced ones. Additionally, we give a generalized result that works for a variety of data distributions and projection indices. We compare these results to computational lower bounds in the Low-Degree-Polynomial Framework. Finally, we experimentally evaluate our method's applicability to real-world data using FashionMNIST and the Human Activity Recognition Dataset, where…
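The idea of optimizing a projection index by gradient steps in a planted vector setting can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' algorithm: the third-moment (skewness) index, the data construction in `make_planted_data`, the random restarts, and all parameter values are choices made here for illustration and are not taken from the paper.

```python
import numpy as np

def make_planted_data(n, d, p, rng):
    """Assumed planted model: Gaussian noise in every direction except a
    hidden unit vector u, along which points form two imbalanced clusters."""
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    # Two-point variable with mean 0 and variance 1; the small cluster
    # has probability p, so the variable is skewed when p != 0.5.
    small = rng.random(n) < p
    s = np.where(small, np.sqrt((1 - p) / p), -np.sqrt(p / (1 - p)))
    G = rng.standard_normal((n, d))
    # Replace the Gaussian component along u with the cluster signal.
    X = G - np.outer(G @ u, u) + np.outer(s, u)
    return X, u

def third_moment_index(X, w):
    """Hypothetical projection index: empirical third moment of X @ w."""
    return np.mean((X @ w) ** 3)

def pursue(X, rng, restarts=20, steps=300, lr=0.1):
    """Gradient ascent on the unit sphere, keeping the best of several
    random restarts (restarts guard against poor initializations)."""
    n, d = X.shape
    best_w, best_val = None, -np.inf
    for _r in range(restarts):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        for _t in range(steps):
            z = X @ w
            grad = 3.0 * (X.T @ z**2) / n   # gradient of mean(z**3)
            grad -= (grad @ w) * w          # project onto tangent space
            w = w + lr * grad
            w /= np.linalg.norm(w)          # retract back to the sphere
        val = third_moment_index(X, w)
        if val > best_val:
            best_w, best_val = w, val
    return best_w, best_val

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, u = make_planted_data(n=20000, d=8, p=0.1, rng=rng)
    w_hat, val = pursue(X, rng)
    print("alignment |<w_hat, u>|:", abs(w_hat @ u))
```

In this toy model the population third moment along u is (1 - 2p) / sqrt(p(1 - p)), which vanishes at p = 0.5. A skewness-based index therefore carries no signal for balanced clusters, which gives some intuition for why imbalance can help recovery in this sketch.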
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Face and Expression Recognition
