Pareto-optimal clustering with the primal deterministic information bottleneck

Andrew K. Tan; Max Tegmark; Isaac L. Chuang

arXiv:2204.02489 · cs.LG · July 29, 2022

TL;DR

This paper studies optimization of the Deterministic Information Bottleneck (DIB) objective over hard clusterings. It introduces a primal formulation of the problem and an algorithm that maps the Pareto frontier of the trade-off between the fidelity and the size of the learned representation, which can inform model selection.

Contribution

The paper introduces the primal DIB problem, develops an algorithm for mapping its Pareto frontier, and analyzes properties of the DIB trade-off in clustering tasks.

Findings

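01. Over discrete search spaces, the primal DIB problem yields a much richer Pareto frontier than its Lagrangian relaxation.

02. Analytic and numerical evidence indicates that the Pareto frontier is logarithmically sparse.

03. Evidence suggests that the proposed algorithm scales polynomially despite the super-exponential search space.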
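
To give a sense of the "super-exponential search space", the sketch below counts hard clusterings under the assumption that a hard clustering of n items is a set partition (cluster labels do not matter), so the count is the Bell number B_n. The function name `bell_numbers` is illustrative and does not come from the paper or its repository.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] computed with the Bell triangle.

    B_n is the number of ways to partition n labeled items into
    non-empty, unlabeled clusters.
    """
    bells = [1]   # B_0 = 1 (the empty partition)
    row = [1]     # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Each following entry adds the entry above-left to its left neighbor.
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


if __name__ == "__main__":
    for n, count in enumerate(bell_numbers(20)):
        print(n, count)
```

Already at 10 items there are 115975 partitions, which is why exhaustive enumeration is only practical for very small inputs.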

Abstract

At the heart of both lossy compression and clustering is a trade-off between the fidelity and size of the learned representation. Our goal is to map out and study the Pareto frontier that quantifies this trade-off. We focus on the optimization of the Deterministic Information Bottleneck (DIB) objective over the space of hard clusterings. To this end, we introduce the primal DIB problem, which we show results in a much richer frontier than its previously studied Lagrangian relaxation when optimized over discrete search spaces. We present an algorithm for mapping out the Pareto frontier of the primal DIB trade-off that is also applicable to other two-objective clustering problems. We study general properties of the Pareto frontier, and we give both analytic and numerical evidence for logarithmic sparsity of the frontier in general. We provide evidence that our algorithm has polynomial…
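
The trade-off described in the abstract can be illustrated on a toy problem. The following is a minimal brute-force sketch, not the paper's algorithm: it enumerates every hard clustering of four items, scores each by the entropy H(Z) of the cluster variable (size) and the mutual information I(Z;Y) with a target variable (fidelity), and keeps the non-dominated points. It then sweeps a Lagrangian of the form H(Z) - beta * I(Z;Y), which is the standard DIB objective and is assumed here rather than taken from the text above. The joint distribution, the beta grid, and all function names are made up for illustration.

```python
import math


def set_partitions(n):
    """Yield every hard clustering of n items as a canonical label tuple."""
    def extend(prefix, max_label):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # A new item may join an existing cluster or open the next one.
        for label in range(max_label + 2):
            yield from extend(prefix + [label], max(max_label, label))
    yield from extend([], -1)


def entropy_and_information(joint, labels):
    """Return (H(Z), I(Z;Y)) in bits for the clustering Z = labels[X]."""
    n_clusters = max(labels) + 1
    n_y = len(joint[0])
    pzy = [[0.0] * n_y for _ in range(n_clusters)]
    for x, z in enumerate(labels):
        for y in range(n_y):
            pzy[z][y] += joint[x][y]
    pz = [sum(row) for row in pzy]
    py = [sum(pzy[z][y] for z in range(n_clusters)) for y in range(n_y)]
    h_z = -sum(p * math.log2(p) for p in pz if p > 0)
    i_zy = sum(
        pzy[z][y] * math.log2(pzy[z][y] / (pz[z] * py[y]))
        for z in range(n_clusters)
        for y in range(n_y)
        if pzy[z][y] > 0
    )
    return h_z, i_zy


def pareto_frontier(points, tol=1e-12):
    """Keep non-dominated points: lower H(Z) and higher I(Z;Y) are better."""
    frontier = []
    best_info = -math.inf
    for h, i, labels in sorted(points, key=lambda p: (p[0], -p[1])):
        if i > best_info + tol:
            frontier.append((h, i, labels))
            best_info = i
    return frontier


def lagrangian_optima(points, betas):
    """Return the clusterings that minimize H(Z) - beta * I(Z;Y) for some beta."""
    found = set()
    for beta in betas:
        _, _, labels = min(points, key=lambda p: p[0] - beta * p[1])
        found.add(labels)
    return found


# Hypothetical joint distribution p(x, y): 4 values of X, 2 values of Y.
joint = [
    [0.20, 0.05],
    [0.20, 0.05],
    [0.05, 0.20],
    [0.05, 0.20],
]

points = [
    (*entropy_and_information(joint, labels), labels)
    for labels in set_partitions(len(joint))
]
frontier = pareto_frontier(points)
found = lagrangian_optima(points, [0.1 * k for k in range(1, 201)])

for h, i, labels in frontier:
    print(f"H(Z)={h:.3f}  I(Z;Y)={i:.3f}  labels={labels}")
print("clusterings reached by the beta sweep:", sorted(found))
```

On this toy input the brute-force frontier contains three points, while the beta sweep reaches only two of them: the middle point lies below the line joining its neighbors, so no value of beta selects it. This is a small instance of why a scalarized objective can miss Pareto-optimal clusterings on a discrete search space.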

Code & Models

Repositories

andrewktan/pareto_dib (official)