Information Bottleneck Revisited: Posterior Probability Perspective with Optimal Transport

Lingyi Chen; Shitong Wu; Wenhao Ye; Huihui Wu; Hao Wu; Wenyi Zhang; Bo Bai; Yining Sun

arXiv:2308.11296·cs.IT·August 23, 2023

Open Access

TL;DR

This paper recasts the information bottleneck (IB) problem as an entropy-regularized optimal transport (OT) problem. The Blahut-Arimoto algorithm can fully recover only strictly concave IB curves; the OT formulation is proposed to overcome that limitation, and numerical experiments are reported to show that it is efficient and effective.

Contribution

The paper proposes an OT-based formulation of the IB problem from a posterior probability perspective and solves it with a generalization of the Sinkhorn algorithm.
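For readers unfamiliar with the Sinkhorn algorithm mentioned above, here is a minimal sketch of the standard Sinkhorn iteration for entropy-regularized OT between two fixed marginals. It is not the paper's generalized variant, which this page does not describe. The function name `sinkhorn`, the regularization strength `eps`, and the toy cost matrix and marginals are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn(cost, mu, nu, eps=0.5, n_iters=500):
    """Standard Sinkhorn iterations (illustrative; not the paper's generalized variant)."""
    kernel = np.exp(-cost / eps)          # Gibbs kernel K = exp(-C / eps)
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(n_iters):
        u = mu / (kernel @ v)             # rescale rows so row sums match mu
        v = nu / (kernel.T @ u)           # rescale columns so column sums match nu
    # Transport plan P = diag(u) K diag(v)
    return u[:, None] * kernel * v[None, :]

# Hypothetical toy problem: 3 source points, 3 target points.
cost = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 1.0],
                 [2.0, 1.0, 0.0]])
mu = np.array([0.5, 0.3, 0.2])            # source marginal
nu = np.array([0.2, 0.3, 0.5])            # target marginal
plan = sinkhorn(cost, mu, nu)
```

Each iteration alternately rescales rows and columns of the kernel, so the returned plan has row sums close to `mu` and column sums close to `nu` once the iteration has converged.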

Findings

1. The OT model effectively solves the IB problem.

2. The generalized Sinkhorn algorithm improves computational efficiency.

3. Numerical experiments validate the approach's effectiveness.

Abstract

Information bottleneck (IB) is a paradigm for extracting the information in one target random variable from another, relevant random variable. It has aroused great interest due to its potential to explain deep neural networks in terms of information compression and prediction. Despite its great importance, finding the optimal bottleneck variable involves a difficult nonconvex optimization problem, due to the nonconvexity of the mutual information constraint. The Blahut-Arimoto (BA) algorithm and its variants provide an approach by considering the Lagrangian of the problem with a fixed Lagrange multiplier. However, only a strictly concave IB curve can be fully obtained by the BA algorithm, which strongly limits its application in machine learning and related fields, as strict concavity cannot be guaranteed in those problems. To overcome the above difficulty, we derive an entropy-regularized optimal transport (OT)…
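To make the baseline concrete, here is a minimal sketch of the classical BA-style self-consistent iteration for the IB Lagrangian with a fixed multiplier `beta`, in the form commonly attributed to Tishby, Pereira, and Bialek. It illustrates the approach the abstract contrasts against, not the OT method of this paper. The function names, the toy joint distribution `p_xy`, and the values of `n_t` and `beta` are assumptions for illustration; the sketch also assumes every row of `p_xy` has positive mass.

```python
import numpy as np

def ib_blahut_arimoto(p_xy, n_t, beta, n_iters=200, seed=0):
    """Classical IB fixed-point iteration for a fixed multiplier beta (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    tiny = 1e-12                                   # guards against log(0) and 0/0
    p_x = p_xy.sum(axis=1)                         # assumes p(x) > 0 for all x
    p_y_given_x = p_xy / p_x[:, None]
    q_t_given_x = rng.random((len(p_x), n_t))      # random initial encoder q(t|x)
    q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)
    for _ in range(n_iters):
        q_t = p_x @ q_t_given_x                    # q(t) = sum_x p(x) q(t|x)
        # q(y|t) = sum_x p(y|x) q(t|x) p(x) / q(t), shape (n_t, n_y)
        q_y_given_t = (q_t_given_x * p_x[:, None]).T @ p_y_given_x / (q_t[:, None] + tiny)
        # kl[x, t] = KL( p(y|x) || q(y|t) )
        log_ratio = (np.log(p_y_given_x[:, None, :] + tiny)
                     - np.log(q_y_given_t[None, :, :] + tiny))
        kl = (p_y_given_x[:, None, :] * log_ratio).sum(axis=2)
        # q(t|x) proportional to q(t) * exp(-beta * kl[x, t]), computed stably in log space
        logits = np.log(q_t + tiny)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)
        q_t_given_x = np.exp(logits)
        q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)
    return q_t_given_x

def mutual_information(p_ab):
    """Mutual information (in nats) of a joint distribution given as a 2-D array."""
    outer = p_ab.sum(axis=1, keepdims=True) @ p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float((p_ab[mask] * np.log(p_ab[mask] / outer[mask])).sum())

# Hypothetical toy joint distribution over 3 values of X and 2 values of Y.
p_xy = np.array([[0.30, 0.05],
                 [0.25, 0.05],
                 [0.05, 0.30]])
encoder = ib_blahut_arimoto(p_xy, n_t=2, beta=5.0)
p_ty = encoder.T @ p_xy                            # joint p(t, y) induced by the encoder
i_ty, i_xy = mutual_information(p_ty), mutual_information(p_xy)
```

Sweeping `beta` and recording the resulting information pairs traces out points on the IB curve; the abstract's point is that this fixed-multiplier sweep can fully recover the curve only when it is strictly concave.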

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning · Markov Chains and Monte Carlo Methods