TL;DR
The paper introduces the deterministic information bottleneck (DIB), an alternative to the information bottleneck (IB) that measures compression with entropy instead of mutual information, yielding deterministic encoders and improved computational efficiency.
Contribution
It proposes the DIB as a new formulation that better captures compression, resulting in deterministic solutions and offering a method to interpolate between soft and hard clustering.
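For reference, the two objectives and the interpolation between them can be written out as below. The notation is an assumption, since this summary does not fix any symbols: $X$ is the source, $Y$ the relevance variable, $T$ the compressed representation, $q(t \mid x)$ the encoder, and $\beta$ the tradeoff parameter. The interpolation parameter $\alpha$ is likewise an assumed label.

```latex
\begin{align}
  L_{\mathrm{IB}}[q(t \mid x)]  &= I(X;T) - \beta\, I(T;Y), \\
  L_{\mathrm{DIB}}[q(t \mid x)] &= H(T) - \beta\, I(T;Y), \\
  L_{\alpha}[q(t \mid x)]       &= H(T) - \alpha\, H(T \mid X) - \beta\, I(T;Y).
\end{align}
% Because I(X;T) = H(T) - H(T|X), setting \alpha = 1 recovers the IB cost,
% and taking \alpha \to 0 recovers the DIB cost.
```

The only difference between the IB and DIB costs is the term $H(T \mid X)$, the noise in the encoder. Dropping it removes any incentive for stochasticity, which is consistent with the hard-clustering solution described in the abstract.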
Findings
DIB outperforms IB when both are evaluated on the DIB cost function.
DIB offers significant computational efficiency gains.
DIB produces deterministic encoders, unlike IB's stochastic encoders.
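To make the difference between the two cost functions concrete, here is a minimal sketch that evaluates both for a given encoder on a toy joint distribution. It assumes the IB cost is I(X;T) - beta * I(T;Y) and the DIB cost is H(T) - beta * I(T;Y). The function names, the toy distribution `p_xy`, the two encoders, and the value of `beta` are all hypothetical and chosen only for illustration. This is not the paper's optimization algorithm; it only scores fixed encoders.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits of a flat list of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """I(A;B) in bits from a joint table indexed as joint[a][b]."""
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    flat = [p for row in joint for p in row]
    return entropy(row_marginal) + entropy(col_marginal) - entropy(flat)


def ib_and_dib_costs(p_xy, q_t_given_x, beta):
    """Score an encoder q(t|x) under both costs (assumed forms, see lead-in)."""
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(q_t_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # p(x, t) = p(x) q(t|x)
    joint_xt = [[p_x[x] * q_t_given_x[x][t] for t in range(n_t)]
                for x in range(n_x)]
    # p(t, y) = sum_x q(t|x) p(x, y), using the Markov chain T - X - Y
    joint_ty = [[sum(q_t_given_x[x][t] * p_xy[x][y] for x in range(n_x))
                 for y in range(n_y)]
                for t in range(n_t)]
    h_t = entropy([sum(col) for col in zip(*joint_xt)])
    i_xt = mutual_information(joint_xt)
    i_ty = mutual_information(joint_ty)
    return {
        "H(T)": h_t,
        "I(X;T)": i_xt,
        "I(T;Y)": i_ty,
        "IB": i_xt - beta * i_ty,
        "DIB": h_t - beta * i_ty,
    }


# Hypothetical toy data: four x values, two y values.
p_xy = [[0.20, 0.05], [0.20, 0.05], [0.05, 0.20], [0.05, 0.20]]
# Deterministic encoder (hard clustering): each x maps to exactly one t.
hard_encoder = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Stochastic encoder (soft clustering): each x maps to a distribution over t.
soft_encoder = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]

beta = 2.0  # illustrative tradeoff value
hard_costs = ib_and_dib_costs(p_xy, hard_encoder, beta)
soft_costs = ib_and_dib_costs(p_xy, soft_encoder, beta)

for name, costs in (("hard", hard_costs), ("soft", soft_costs)):
    print(name, {key: round(value, 4) for key, value in costs.items()})
```

For a deterministic encoder, H(T|X) = 0, so H(T) and I(X;T) coincide and the two costs agree. For a stochastic encoder, H(T) exceeds I(X;T), so the DIB cost penalizes the encoder's noise while the IB cost does not.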
Abstract
Lossy compression and clustering fundamentally involve a decision about what features are relevant and which are not. The information bottleneck method (IB) by Tishby, Pereira, and Bialek formalized this notion as an information-theoretic optimization problem and proposed an optimal tradeoff between throwing away as many bits as possible, and selectively keeping those that are most important. In the IB, compression is measured by mutual information. Here, we introduce an alternative formulation that replaces mutual information with entropy, which we call the deterministic information bottleneck (DIB), that we argue better captures this notion of compression. As suggested by its name, the solution to the DIB problem turns out to be a deterministic encoder, or hard clustering, as opposed to the stochastic encoder, or soft clustering, that is optimal under the IB. We compare the IB and DIB…
