New Coresets for Projective Clustering and Applications
Murad Tukan, Xuan Wu, Samson Zhou, Vladimir Braverman, Dan Feldman

TL;DR
This paper introduces novel coreset algorithms for $(j,k)$-projective clustering and general $M$-estimator regression, enabling efficient approximation of complex clustering and regression tasks in high-dimensional data.
Contribution
It presents the first polynomial-size $L_\infty$ coreset for $(j,k)$-projective clustering and the first strong coreset construction for various $M$-estimator regressions, advancing scalable data analysis.
Findings
Coreset algorithms significantly reduce data size for clustering and regression.
Experimental results demonstrate practical efficiency and accuracy.
New methods outperform existing approaches on real-world datasets.
Abstract
$(j,k)$-projective clustering is the natural generalization of the family of $k$-clustering and $j$-subspace clustering problems. Given a set of points $P$ in $\mathbb{R}^d$, the goal is to find $k$ flats of dimension $j$, i.e., affine subspaces, that best fit $P$ under a given distance measure. In this paper, we propose the first algorithm that returns an $L_\infty$ coreset of size polynomial in $d$. Moreover, we give the first strong coreset construction for general $M$-estimator regression. Specifically, we show that our construction provides efficient coreset constructions for Cauchy, Welsch, Huber, Geman-McClure, Tukey, $L_1$-$L_2$, and Fair regression, as well as general concave and power-bounded loss functions. Finally, we provide experimental results based on real-world datasets, showing the efficacy of our approach.
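To make the two objectives in the abstract concrete, here is a minimal sketch in Python with NumPy. It does not reproduce the paper's coreset construction; it only evaluates the costs that a coreset is meant to approximate. All function names are hypothetical, each flat is assumed to be given as an offset plus a basis with orthonormal columns, and the Huber and Cauchy formulas use one common parametrization that may differ from the paper's by constants.

```python
import numpy as np

def dist_to_flat(P, offset, basis):
    """Euclidean distance from each row of P to the flat {offset + basis @ c}.

    Assumes basis has shape (d, j) with orthonormal columns.
    """
    centered = P - offset
    projected = centered @ basis @ basis.T  # component lying inside the flat
    return np.linalg.norm(centered - projected, axis=1)

def projective_cost(P, flats, agg="max"):
    """Charge each point its distance to the nearest of the k flats.

    agg="max" gives the worst-case (L_infinity style) cost,
    agg="sum" gives the sum of distances.
    """
    dists = np.stack([dist_to_flat(P, o, B) for o, B in flats], axis=1)
    nearest = dists.min(axis=1)
    return nearest.max() if agg == "max" else nearest.sum()

def huber(r, delta=1.0):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))

def cauchy(r, c=1.0):
    """Cauchy loss: grows logarithmically in the residual."""
    return 0.5 * c**2 * np.log1p((r / c) ** 2)

def m_estimator_cost(X, y, w, loss):
    """Sum of a robust loss over the regression residuals X @ w - y."""
    return loss(X @ w - y).sum()

# Toy data: three points in the plane and two horizontal lines (j = 1, k = 2).
P = np.array([[0.0, 1.0], [2.0, -3.0], [5.0, 0.0]])
x_axis = (np.array([0.0, 0.0]), np.array([[1.0], [0.0]]))
lower_line = (np.array([0.0, -3.0]), np.array([[1.0], [0.0]]))

cost_one_flat = projective_cost(P, [x_axis], agg="max")
cost_two_flats = projective_cost(P, [x_axis, lower_line], agg="max")
```

Adding the second line can only lower the cost, since every point is charged its distance to the nearest flat. A coreset in this setting would be a small subset (possibly weighted) of `P` on which costs like these stay close to their values on the full set for every candidate set of flats or every regression vector `w`.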
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Remote-Sensing Image Classification · Face and Expression Recognition
