Quantizing Multiple Sources to a Common Cluster Center: An Asymptotic Analysis
Erdem Koyuncu

TL;DR
This paper analyzes the asymptotic behavior of quantizing concatenated noisy observations from multiple sources to a common cluster center, providing formulas, algorithms, and empirical validation for improved clustering performance.
Contribution
It introduces a novel asymptotic analysis for quantizing multiple noisy sources to a shared cluster center, along with an optimization algorithm and empirical validation.
Findings
Derived a formula for average distortion in the asymptotic regime.
Provided an algorithm for numerically optimizing cluster centers.
Showed improved clustering performance over naive methods.
Abstract
We consider quantizing an Ld-dimensional sample, which is obtained by concatenating L vectors from datasets of d-dimensional vectors, to a d-dimensional cluster center. The distortion measure is the weighted sum of rth powers of the distances between the cluster center and the samples. For L = 1, one recovers the ordinary center-based clustering formulation. The general case L > 1 appears when one wishes to cluster a dataset through L noisy observations of each of its members. We find a formula for the average distortion performance in the asymptotic regime where the number of cluster centers is large. We also provide an algorithm to numerically optimize the cluster centers and verify our analytical results on real and artificial datasets. In terms of faithfulness to the original (noiseless) dataset, our clustering approach outperforms the naive approach that relies on…
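The distortion measure described in the abstract can be illustrated with a short sketch. The notation is chosen here for illustration only: L is the number of concatenated observations per sample, d their dimension, `weights` the per-observation weights, and r the power applied to each distance. The function `lloyd_common_center` is a hypothetical Lloyd-style alternating minimization for the special case r = 2 with positive weights, where the best center of a cell is the mean of the weighted per-sample averages. It is not the paper's algorithm, which the abstract does not spell out.

```python
import numpy as np

def distortion(samples, centers, weights, r=2.0):
    """Weighted sum of r-th powers of distances to each center.

    samples: (n, L, d) array, L observations of dimension d per sample.
    centers: (K, d) array of common cluster centers.
    weights: (L,) array of per-observation weights (assumed positive).
    Returns an (n, K) array of costs.
    """
    # Pairwise differences between every observation and every center: (n, L, K, d)
    diff = samples[:, :, None, :] - centers[None, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # (n, L, K)
    return np.einsum("l,nlk->nk", weights, dist ** r)

def lloyd_common_center(samples, K, weights, iters=50, seed=0):
    """Illustrative Lloyd-style iteration for r = 2 (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    n = samples.shape[0]
    # For r = 2 the cost depends on a sample only through its weighted average.
    wmean = np.einsum("l,nld->nd", weights, samples) / weights.sum()
    centers = wmean[rng.choice(n, size=K, replace=False)].copy()
    for _ in range(iters):
        # Assignment step: each sample goes to the center with least distortion.
        labels = distortion(samples, centers, weights, r=2.0).argmin(axis=1)
        # Update step: the minimizer is the mean of the weighted averages.
        for k in range(K):
            members = wmean[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    labels = distortion(samples, centers, weights, r=2.0).argmin(axis=1)
    return centers, labels

# Usage: 80 clean 2-D points, each seen through L = 3 noisy observations.
rng = np.random.default_rng(1)
clean = np.concatenate([rng.normal(0, 1, (40, 2)), rng.normal(10, 1, (40, 2))])
noisy = clean[:, None, :] + rng.normal(0, 0.5, (80, 3, 2))
w = np.array([1.0, 1.0, 2.0])
centers, labels = lloyd_common_center(noisy, K=2, weights=w)
```

With L = 1 and a single unit weight, `distortion` reduces to the usual r-th power distance used in ordinary center-based clustering, which matches the special case noted in the abstract. For r other than 2 the update step has no closed form in general and would need a numerical minimizer.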
Taxonomy
Topics: Advanced Data Compression Techniques · Image and Signal Denoising Methods · Sparse and Compressive Sensing Techniques
