Boltzmann-Shannon Index: A Geometric-Aware Measure of Clustering Balance
Emanuele Bossi, C. Tyler Diggans, Abd AlRahman R. AlMomani

TL;DR
The paper introduces the Boltzmann-Shannon Index (BSI), a geometric-aware measure of clustering balance that gives a coherent assessment in cases where traditional metrics give incomplete or misleading signals.
Contribution
It presents the BSI as a new normalized measure combining geometric and frequency-based probabilities, applicable to diverse clustering and resource-allocation problems.
Findings
BSI effectively assesses clustering balance in synthetic and real datasets.
BSI detects inequality with high sensitivity in resource allocation scenarios.
BSI provides a smooth, optimization-friendly measure for symbolic representations of dynamical systems.
Abstract
The Boltzmann-Shannon Index (BSI) for clustered continuous data is introduced as a normalized measure that captures the relationship between geometry-based and frequency-based probability distributions defined over the clusters. In essence, it quantifies the similarity across densities of the clusters, which are defined by a given labeling. This labeling may originate from a geometric partitioning of the state space itself, but need not in general. We illustrate its performance on synthetic Gaussian mixtures, the Iris benchmark data set, and a high-imbalance resource-allocation scenario, showing that the BSI provides a coherent assessment in cases where traditional metrics give incomplete or misleading signals. Moreover, in the resource-allocation setting where equal density may be associated with a "fair" distribution, we demonstrate that BSI not only detects inequality with high…
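To make the two distributions in the abstract concrete, here is a minimal Python sketch. It is not the paper's BSI formula, which this page does not reproduce. It assumes the frequency-based probability of a cluster is its share of the points, and the geometry-based probability is its share of total volume. The axis-aligned bounding box as a volume proxy, the function names, and the Jensen-Shannon similarity used as a stand-in score are all illustrative assumptions.

```python
import numpy as np


def frequency_probs(labels):
    """Fraction of points carrying each label (labels sorted ascending)."""
    _, counts = np.unique(labels, return_counts=True)
    return counts / counts.sum()


def geometric_probs(points, labels):
    """Each cluster's share of total volume.

    Assumption: volume is approximated by the axis-aligned bounding box
    of the cluster's points. A cluster with zero extent along any axis
    gets zero volume under this proxy.
    """
    volumes = []
    for k in np.unique(labels):
        cluster = points[labels == k]
        extent = cluster.max(axis=0) - cluster.min(axis=0)
        volumes.append(np.prod(extent))
    volumes = np.asarray(volumes, dtype=float)
    return volumes / volumes.sum()


def js_similarity(p, q):
    """1 - Jensen-Shannon divergence (base 2), in [0, 1].

    A stand-in score, NOT the BSI: it equals 1 when p == q, which under
    the assumptions above means every cluster has the same density.
    """
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 1.0 - 0.5 * (kl(p, m) + kl(q, m))
```

For example, two clusters with four points each, one spanning a unit square and the other a 2-by-2 square, have frequency probabilities (0.5, 0.5) but geometric probabilities (0.2, 0.8). The stand-in score then falls below 1, reflecting the unequal densities.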
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques · Complex Network Analysis Techniques
