Kernelized Locality-Sensitive Hashing for Semi-Supervised Agglomerative Clustering
Boyi Xie, Shuheng Zheng

TL;DR
This paper introduces a fast semi-supervised agglomerative clustering method that uses kernelized LSH to approximate distances, significantly reducing computation time while leveraging labeled data for improved accuracy.
Contribution
The paper presents a novel kernelized LSH-based approach for scalable semi-supervised agglomerative clustering that combines efficient hashing with metric learning.
Findings
Reduces clustering computation time significantly
Achieves competitive precision and recall with less computation
Effectively incorporates labeled data for improved clustering
Abstract
Large-scale agglomerative clustering is hindered by its computational burden. We propose a novel scheme in which exact inter-instance distance calculation is replaced by the Hamming distance between Kernelized Locality-Sensitive Hashing (KLSH) hash values. The resulting method drastically decreases computation time. Additionally, we take advantage of certain labeled data points via distance metric learning to achieve precision and recall competitive with K-Means in much less computation time.
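The core idea in the abstract, clustering on Hamming distances between binary hash codes instead of exact distances, can be illustrated with a minimal sketch. This is not the paper's implementation: plain random-hyperplane LSH stands in for KLSH (no kernel is involved), average linkage is an assumed merge criterion, the metric-learning step is omitted, and the names `hyperplane_hash`, `hamming`, and `agglomerate` are hypothetical.

```python
import random


def hyperplane_hash(points, n_bits, seed=0):
    """Map each point to an n_bits-wide integer code using random hyperplanes.

    Assumption: this is ordinary sign-of-projection LSH, used here only as a
    simple stand-in for the kernelized hash functions of KLSH.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    codes = []
    for p in points:
        code = 0
        for plane in planes:
            dot = sum(w * x for w, x in zip(plane, p))
            code = (code << 1) | (1 if dot >= 0 else 0)
        codes.append(code)
    return codes


def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


def agglomerate(codes, n_clusters):
    """Naive average-linkage agglomerative clustering over Hamming distances.

    Returns a list of clusters, each a list of indices into `codes`.
    """
    clusters = [[i] for i in range(len(codes))]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    hamming(codes[a], codes[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


if __name__ == "__main__":
    # Two small groups of 2-D points pointing in opposite directions.
    points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1),
              (-1.0, -1.0), (-1.1, -0.9), (-0.9, -1.1)]
    codes = hyperplane_hash(points, n_bits=16)
    print(agglomerate(codes, n_clusters=2))
```

The pairwise search above is quadratic per merge and is kept only for readability. The point of the sketch is that each distance evaluation reduces to an XOR and a bit count on short codes, rather than a full computation in the original feature space.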
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Face and Expression Recognition · Video Surveillance and Tracking Methods
