Contrastive Entropy Bounds for Density and Conditional Density Decomposition
Bo Hu, Jose C. Principe

TL;DR
This paper introduces a novel framework using Gaussian and Hilbert space decompositions to interpret neural network features, proposing new bounds and training objectives for autoencoders and mixture density networks that enhance diversity and interpretability.
Contribution
The paper develops a Gaussian-operator-based approach to training autoencoders and MDNs, introducing a nuclear norm divergence and a Hilbert space bound to improve model interpretability and sample diversity.
Findings
The autoencoder objective is equivalent to maximizing the trace of a Gaussian operator, i.e., the sum of its eigenvalues.
Nuclear norm can be used as a divergence measure for MDNs.
Hilbert space bounds increase sample diversity and prevent trivial solutions.
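The first two findings rest on a standard linear-algebra identity: for a symmetric positive semidefinite matrix, the trace, the sum of eigenvalues, and the nuclear norm (sum of singular values) coincide. The sketch below checks this numerically. The Gram matrix built from a Gaussian kernel is only a hypothetical stand-in for the paper's Gaussian operator, and the data, bandwidth, and variable names are assumptions made for illustration, not the authors' construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a Gaussian operator: the Gram matrix of a
# Gaussian kernel on 6 random 2-D points (unit bandwidth is an assumption).
x = rng.normal(size=(6, 2))
sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
gram = np.exp(-sq_dists / 2.0)  # symmetric, positive semidefinite

# Three ways to compute the same quantity for a symmetric PSD matrix.
trace_value = np.trace(gram)                   # sum of diagonal entries
eigen_sum = np.linalg.eigvalsh(gram).sum()     # sum of eigenvalues
nuclear_norm = np.linalg.norm(gram, ord="nuc") # sum of singular values

print(trace_value, eigen_sum, nuclear_norm)
```

For a matrix that is not positive semidefinite (for example, the difference of two such matrices), the nuclear norm and the trace generally differ, which is one reason a nuclear norm is a natural candidate when a nonnegative divergence-like quantity is needed.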
Abstract
This paper studies the interpretability of neural network features from a Bayesian Gaussian view, where optimizing a cost is reaching a probabilistic bound; learning a model approximates a density that makes the bound tight and the cost optimal, often with a Gaussian mixture density. The two examples are Mixture Density Networks (MDNs) using the bound for the marginal and autoencoders using the conditional bound. It is a known result, not only for autoencoders, that minimizing the error between inputs and outputs maximizes the dependence between inputs and the middle. We use Hilbert space and decomposition to address cases where a multiple-output network produces multiple centers defining a Gaussian mixture. Our first finding is that an autoencoder's objective is equivalent to maximizing the trace of a Gaussian operator, the sum of eigenvalues under bases orthonormal w.r.t. the data…
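To make the idea of "multiple centers defining a Gaussian mixture" concrete, here is a minimal sketch of a one-dimensional mixture density built from a list of centers. Equal mixture weights, a shared standard deviation `sigma`, and the function name `mixture_density` are all assumptions chosen for illustration; the abstract does not specify how the paper weights or scales its components.

```python
import numpy as np

def mixture_density(x, centers, sigma=0.5):
    """Equal-weight 1-D Gaussian mixture density evaluated at points x.

    Equal weights and a shared sigma are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)[:, None]            # shape (n_points, 1)
    centers = np.asarray(centers, dtype=float)[None]   # shape (1, n_centers)
    norm = 1.0 / (sigma * np.sqrt(2.0 * np.pi))
    components = norm * np.exp(-0.5 * ((x - centers) / sigma) ** 2)
    return components.mean(axis=1)                     # average over centers

# Hypothetical centers, e.g. the outputs of a multiple-output network.
centers = [-2.0, 0.0, 1.5]
grid = np.linspace(-10.0, 10.0, 4001)
density = mixture_density(grid, centers)

# Riemann-sum check that the density integrates to approximately one.
total_mass = float(np.sum(density) * (grid[1] - grid[0]))
print(total_mass)
```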
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
