Probabilistic Multilevel Clustering via Composite Transportation Distance
Nhat Ho, Viet Huynh, Dinh Phung, Michael I. Jordan

TL;DR
This paper introduces a probabilistic multilevel clustering method based on composite transportation distance, a variant of transportation distance whose underlying metric is Kullback-Leibler divergence, with efficient algorithms evaluated on synthetic and real datasets.
Contribution
The paper presents a novel probabilistic approach to multilevel clustering using composite transportation distance and develops fast optimization algorithms that scale to potentially large multilevel datasets.
Findings
Efficient algorithms for multilevel clustering
Successful application to synthetic data
Effective on real-world datasets
Abstract
We propose a novel probabilistic approach to multilevel clustering problems based on composite transportation distance, which is a variant of transportation distance where the underlying metric is Kullback-Leibler divergence. Our method involves solving a joint optimization problem over spaces of probability measures to simultaneously discover grouping structures within groups and among groups. By exploiting the connection of our method to the problem of finding composite transportation barycenters, we develop fast and efficient optimization algorithms even for potentially large-scale multilevel datasets. Finally, we present experimental results with both synthetic and real data to demonstrate the efficiency and scalability of the proposed approach.
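To make the central definition concrete, the sketch below computes a transportation distance between two finite mixtures of one-dimensional Gaussians, using the closed-form Kullback-Leibler divergence between components as the ground cost. It illustrates the definition only and is not the authors' clustering or barycenter algorithm. The function names (`kl_gauss_1d`, `composite_transport`), the restriction to 1-D Gaussians, the argument order of the KL divergence, and the use of a generic linear-programming solver are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def kl_gauss_1d(m1, s1, m2, s2):
    """Closed-form KL( N(m1, s1^2) || N(m2, s2^2) ) for 1-D Gaussians."""
    return np.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2.0 * s2**2) - 0.5


def composite_transport(w_p, mu_p, sd_p, w_q, mu_q, sd_q):
    """Transportation distance between two Gaussian mixtures with a KL ground cost.

    Hypothetical helper: solves the optimal transport linear program
        min_pi  sum_ij pi_ij * KL(component i of P || component j of Q)
    subject to the rows of pi summing to w_p and the columns to w_q.
    Returns the optimal cost and the transport plan.
    """
    w_p, mu_p, sd_p = (np.asarray(x, dtype=float) for x in (w_p, mu_p, sd_p))
    w_q, mu_q, sd_q = (np.asarray(x, dtype=float) for x in (w_q, mu_q, sd_q))
    k, l = len(w_p), len(w_q)

    # Pairwise KL cost matrix of shape (k, l), built by broadcasting.
    cost = kl_gauss_1d(mu_p[:, None], sd_p[:, None], mu_q[None, :], sd_q[None, :])

    # Marginal constraints on the flattened (row-major) plan.
    a_eq = np.zeros((k + l, k * l))
    for i in range(k):
        a_eq[i, i * l:(i + 1) * l] = 1.0   # row i of the plan sums to w_p[i]
    for j in range(l):
        a_eq[k + j, j::l] = 1.0            # column j of the plan sums to w_q[j]
    b_eq = np.concatenate([w_p, w_q])

    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(k, l)


# Example: a two-component mixture against a shifted, reweighted one.
dist, plan = composite_transport(
    [0.5, 0.5], [0.0, 4.0], [1.0, 1.0],
    [0.25, 0.75], [0.5, 4.5], [1.0, 1.0],
)
print("distance:", dist)
print("plan:\n", plan)
```

Because KL divergence is asymmetric, swapping the two mixtures generally changes the value, so the order of arguments matters. For large numbers of components, a dedicated optimal transport solver or entropic regularization would usually replace the generic LP used here.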
Taxonomy
Topics: Data Management and Algorithms · Complex Network Analysis Techniques · Advanced Clustering Algorithms Research
