Properties of Minimizing Entropy
Xu Ji, Lena Nehale-Ezzine, Maksym Korablyov

TL;DR
This paper examines the relationship between entropy and cardinality as measures of how compact a data representation is. It proposes a new measure, expected cardinality, and shows that minimizing entropy also reduces it.
Contribution
The paper introduces expected cardinality as a compactness measure and shows that minimizing entropy also minimizes it, bridging distribution-sensitive measures (entropy) and distribution-insensitive ones (cardinality).
Findings
Minimizing entropy reduces expected cardinality.
Expected cardinality discounts states with negligible probability.
Entropy and cardinality are related measures of data compactness.
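To make the second finding concrete, here is a short worked form of expected cardinality. It is a sketch under an assumption not spelled out on this page: the draws are i.i.d. from a discrete distribution p over K states, and C_n (a symbol introduced here for illustration) is the number of distinct states seen in n draws. The paper's own formulation may differ in notation.

```latex
% State i appears at least once in n i.i.d. draws with probability 1 - (1 - p_i)^n,
% so by linearity of expectation:
\mathbb{E}[C_n] \;=\; \sum_{i=1}^{K} \bigl(1 - (1 - p_i)^n\bigr)

% Bernoulli's inequality, (1 - p_i)^n \ge 1 - n p_i, bounds each term:
0 \;\le\; 1 - (1 - p_i)^n \;\le\; \min(1,\; n p_i)

% A state with p_i \ll 1/n therefore contributes about n p_i \approx 0,
% whereas standard cardinality counts it as 1. In the limit of many draws:
\lim_{n \to \infty} \mathbb{E}[C_n] \;=\; \bigl|\{\, i : p_i > 0 \,\}\bigr|
```

Under this reading, standard cardinality is the large-n limit of expected cardinality, and for any finite n the low-probability states are discounted in proportion to their mass.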
Abstract
Compact data representations are one approach for improving generalization of learned functions. We explicitly illustrate the relationship between entropy and cardinality, both measures of compactness, including how gradient descent on the former reduces the latter. Whereas entropy is distribution sensitive, cardinality is not. We propose a third compactness measure that is a compromise between the two: expected cardinality, or the expected number of unique states in any finite number of draws, which is more meaningful than standard cardinality as it discounts states with negligible probability mass. We show that minimizing entropy also minimizes expected cardinality.
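The abstract's claim that gradient descent on entropy also reduces expected cardinality can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes i.i.d. draws from a categorical distribution parameterized by softmax logits, and the function names, starting logits, learning rate, step count, and number of draws n are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p):
    """Shannon entropy in nats; zero-probability states contribute nothing."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def expected_cardinality(p, n):
    """Expected number of distinct states in n i.i.d. draws (assumed definition)."""
    return sum(1.0 - (1.0 - pi) ** n for pi in p)


def minimize_entropy(logits, lr=0.5, steps=200, n=10):
    """Plain gradient descent on entropy with respect to the logits.

    Returns the final probabilities and a history of
    (entropy, expected_cardinality) pairs, one per iterate.
    """
    z = list(logits)
    history = []
    for _ in range(steps):
        p = softmax(z)
        h = entropy(p)
        history.append((h, expected_cardinality(p, n)))
        # For p = softmax(z): dH/dz_j = -p_j * (log p_j + H)
        grad = [-pj * (math.log(pj) + h) if pj > 0 else 0.0 for pj in p]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    p = softmax(z)
    history.append((entropy(p), expected_cardinality(p, n)))
    return p, history


if __name__ == "__main__":
    # Hypothetical starting point: four states, mildly non-uniform.
    final_p, hist = minimize_entropy([0.5, 0.2, 0.0, -0.3])
    print("start: entropy=%.4f expected_cardinality=%.4f" % hist[0])
    print("end:   entropy=%.4f expected_cardinality=%.4f" % hist[-1])
    print("final probabilities:", [round(x, 4) for x in final_p])
```

Running the script prints the entropy and expected cardinality before and after the descent. In this toy setting the probability mass concentrates on the initially most likely state, so both quantities fall together; the example illustrates the claim on one instance and is not a proof of the paper's general result.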
Taxonomy
Topics: Machine Learning and Algorithms · Neural Networks and Applications · Reinforcement Learning in Robotics
