Distributed Information-Theoretic Clustering
Georg Pichler, Pablo Piantanida, Gerald Matz

TL;DR
This paper introduces a new multi-terminal source coding framework motivated by biclustering, analyzing information-theoretic limits and connections to related problems, with specific results for binary sources and an extension to the MD-CEO problem.
Contribution
It proposes a novel information-theoretic clustering setup, improves bounds for binary sources, and provides a tight characterization for a multiple description extension of the CEO problem.
Findings
Improved cardinality bounds for inner and outer bounds.
Thorough analysis of binary symmetric sources.
Tight single-letter characterization of the MD-CEO problem.
Abstract
We study a novel multi-terminal source coding setup motivated by the biclustering problem. Two separate encoders observe two i.i.d. sequences X^n and Y^n, respectively. The goal is to find rate-limited encodings f(X^n) and g(Y^n) that maximize the mutual information I(f(X^n); g(Y^n)). We discuss connections of this problem with hypothesis testing against independence, pattern recognition, and the information bottleneck method. Improving previous cardinality bounds for the inner and outer bounds allows us to thoroughly study the special case of a binary symmetric source and to quantify the gap between the inner and the outer bound in this special case. Furthermore, we investigate a multiple description (MD) extension of the Chief Executive Officer (CEO) problem with mutual information constraint. Surprisingly, this MD-CEO problem permits a tight single-letter characterization…
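To make the objective in the abstract concrete, the following is a minimal sketch, not the authors' method, that evaluates the mutual information between two encodings by brute-force enumeration for a small block length. It assumes a doubly symmetric binary source (uniform bits, with each pair of corresponding bits differing independently with probability p). The function names, the parameter p, and the toy encoders are hypothetical and chosen only for illustration.

```python
import itertools
import math


def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def encoded_mutual_information(n, p, f, g):
    """Mutual information (bits) between f(X^n) and g(Y^n).

    Assumption: (X_i, Y_i) are i.i.d. pairs from a doubly symmetric binary
    source, i.e. X_i is a uniform bit and Y_i differs from X_i with
    probability p. The encoders f and g map bit tuples to hashable labels.
    """
    joint = {}
    for x in itertools.product((0, 1), repeat=n):
        for y in itertools.product((0, 1), repeat=n):
            d = sum(a != b for a, b in zip(x, y))  # Hamming distance
            prob = (0.5 ** n) * (p ** d) * ((1 - p) ** (n - d))
            key = (f(x), g(y))
            joint[key] = joint.get(key, 0.0) + prob

    # Marginal distributions of the two encoder outputs.
    pu, pv = {}, {}
    for (u, v), q in joint.items():
        pu[u] = pu.get(u, 0.0) + q
        pv[v] = pv.get(v, 0.0) + q

    return sum(
        q * math.log2(q / (pu[u] * pv[v]))
        for (u, v), q in joint.items()
        if q > 0
    )


# Toy encoders: each terminal keeps only its first observed bit.
mi_first_bit = encoded_mutual_information(3, 0.1, lambda x: x[0], lambda y: y[0])
# A constant encoder carries no information about the other terminal.
mi_constant = encoded_mutual_information(3, 0.1, lambda x: 0, lambda y: y[0])
```

With these toy encoders, the first value equals 1 - h(p), the mutual information of a single source pair, and the second is zero. Enumeration scales as 4^n, so this is only practical for very small n.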
