Bounds on mutual information of mixture data for classification tasks
Yijun Ding, Amit Ashok

TL;DR
This paper develops new bounds and estimators for the mutual information between mixture data and class labels in classification tasks, addressing computational challenges and providing tools for performance quantification.
Contribution
It introduces a variational upper bound, a lower bound, and three estimators for the mutual information between mixture data and class labels, all built from pair-wise divergences between mixture components.
Findings
The new bounds and estimators are compared against Monte Carlo stochastic sampling and against bounds derived from entropy bounds.
Estimators effectively approximate mutual information.
Numerical simulations validate the bounds and estimators.
Abstract
The data for many classification problems, such as pattern and speech recognition, follow mixture distributions. To quantify the optimum performance for classification tasks, the Shannon mutual information is a natural information-theoretic metric, as it is directly related to the probability of error. The mutual information between mixture data and the class label has neither an analytical expression nor an efficient computational algorithm. We introduce a variational upper bound, a lower bound, and three estimators, all employing pair-wise divergences between mixture components. We compare the new bounds and estimators with Monte Carlo stochastic sampling and with bounds derived from entropy bounds. Finally, we evaluate the performance of the bounds and estimators through numerical simulations.
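To make the two baselines named in the abstract concrete, the sketch below computes them for a one-dimensional Gaussian mixture, where the class label C selects the component and I(X;C) = E[ln p(x|c) - ln p(x)]. It is an illustration under stated assumptions, not the paper's new bounds or estimators: the Monte Carlo routine is plain stochastic sampling, and the pair-wise bounds are the ones that follow from the mixture-entropy bounds of Kolchinsky and Tracey (KL divergence for the upper bound, Bhattacharyya distance for the lower bound). The function names, the 1-D Gaussian setting, and the example parameters are all chosen here for illustration.

```python
import numpy as np

def log_gauss(x, mu, sigma):
    """Log-density of a 1-D Gaussian N(mu, sigma^2), with broadcasting."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

def mc_mutual_information(w, mu, sigma, n=200_000, seed=0):
    """Monte Carlo estimate of I(X;C) in nats (baseline, not the paper's estimator)."""
    rng = np.random.default_rng(seed)
    c = rng.choice(len(w), size=n, p=w)            # sample class labels
    x = rng.normal(mu[c], sigma[c])                # sample data given the label
    log_comp = log_gauss(x[:, None], mu[None, :], sigma[None, :])  # shape (n, K)
    a = log_comp + np.log(w)[None, :]
    m = a.max(axis=1, keepdims=True)               # stable log-sum-exp for ln p(x)
    log_mix = m[:, 0] + np.log(np.exp(a - m).sum(axis=1))
    return float(np.mean(log_comp[np.arange(n), c] - log_mix))

def kl_matrix(mu, sigma):
    """KL(p_i || p_j) between 1-D Gaussian components."""
    mi, mj = mu[:, None], mu[None, :]
    si, sj = sigma[:, None], sigma[None, :]
    return np.log(sj / si) + (si**2 + (mi - mj) ** 2) / (2 * sj**2) - 0.5

def bhattacharyya_matrix(mu, sigma):
    """Bhattacharyya distance between 1-D Gaussian components."""
    mi, mj = mu[:, None], mu[None, :]
    si, sj = sigma[:, None], sigma[None, :]
    v = si**2 + sj**2
    return (mi - mj) ** 2 / (4 * v) + 0.5 * np.log(v / (2 * si * sj))

def pairwise_bound(w, dist):
    """-sum_i w_i ln sum_j w_j exp(-D_ij): bounds I(X;C) when H(X|C) cancels."""
    return float(-np.sum(w * np.log(np.exp(-dist) @ w)))

if __name__ == "__main__":
    # Hypothetical two-class example.
    w = np.array([0.5, 0.5])
    mu = np.array([0.0, 2.0])
    sigma = np.array([1.0, 1.0])
    print("lower (Bhattacharyya):", pairwise_bound(w, bhattacharyya_matrix(mu, sigma)))
    print("Monte Carlo estimate :", mc_mutual_information(w, mu, sigma))
    print("upper (KL)           :", pairwise_bound(w, kl_matrix(mu, sigma)))
```

The pair-wise bounds need only K × K closed-form divergences, whereas the Monte Carlo estimate needs many samples and carries sampling noise. That trade-off is the motivation the abstract gives for divergence-based bounds and estimators.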
