Properties of Minimizing Entropy

Xu Ji; Lena Nehale-Ezzine; Maksym Korablyov

arXiv:2112.03143 · cs.LG · December 7, 2021

Open Access

TL;DR

This paper explores the relationship between entropy and cardinality in data representations, proposing a new measure called expected cardinality, and demonstrating how minimizing entropy reduces this measure, thereby improving data compactness.

Contribution

The paper introduces expected cardinality as a new measure of compactness and shows that minimizing entropy also minimizes expected cardinality, bridging the gap between distribution-sensitive measures (entropy) and distribution-insensitive measures (cardinality).

Findings

01. Minimizing entropy reduces expected cardinality.

02. Expected cardinality discounts states with negligible probability.

03. Entropy and cardinality are related measures of data compactness.
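
The discounting in the second finding can be made concrete with a short worked formula. This is a sketch under stated assumptions, not the paper's own derivation: it assumes a discrete distribution p over states and n independent draws with replacement, and the symbol C_n (the number of distinct states observed) is notation introduced here for illustration. By linearity of expectation, each state contributes the probability that it appears at least once.

```latex
% Assumption: n i.i.d. draws from a discrete distribution p = (p_1, ..., p_K).
% C_n (illustrative notation) is the number of distinct states observed.
\mathbb{E}[C_n] \;=\; \sum_{i=1}^{K} \Pr(\text{state } i \text{ is drawn at least once})
              \;=\; \sum_{i=1}^{K} \bigl(1 - (1 - p_i)^n\bigr)

% Worked example: a state with p_i = 10^{-6} and n = 100 draws contributes
1 - (1 - 10^{-6})^{100} \;\approx\; 100 \cdot 10^{-6} \;=\; 10^{-4},
% whereas standard cardinality counts the same state as a full 1.
```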

Abstract

Compact data representations are one approach for improving generalization of learned functions. We explicitly illustrate the relationship between entropy and cardinality, both measures of compactness, including how gradient descent on the former reduces the latter. Whereas entropy is distribution sensitive, cardinality is not. We propose a third compactness measure that is a compromise between the two: expected cardinality, or the expected number of unique states in any finite number of draws, which is more meaningful than standard cardinality as it discounts states with negligible probability mass. We show that minimizing entropy also minimizes expected cardinality.
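
The relationship described in the abstract can be explored numerically. The following is a minimal sketch, not the authors' implementation: it assumes the distribution is parameterized as a softmax over logits, that draws are independent (so expected cardinality over n draws has the closed form sum_i (1 - (1 - p_i)^n)), and that plain gradient descent with an analytic gradient is used. All function names, the starting logits, the learning rate, and the step count are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array of logits.
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy(p):
    # Shannon entropy in nats; zero-probability states contribute nothing.
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def expected_cardinality(p, n):
    # Expected number of distinct states seen in n i.i.d. draws (assumption):
    # each state i is seen at least once with probability 1 - (1 - p_i)^n.
    return float((1.0 - (1.0 - p) ** n).sum())

def entropy_grad(z):
    # Analytic gradient of H(softmax(z)) with respect to the logits z:
    # dH/dz_j = -p_j * (log p_j + H).
    p = softmax(z)
    log_p = np.log(np.clip(p, 1e-300, None))
    h = -(p * log_p).sum()
    return -p * (log_p + h)

def minimize_entropy(z0, lr=0.5, steps=200, n=10):
    # Plain gradient descent on entropy, recording
    # (entropy, expected cardinality) before every update.
    z = np.asarray(z0, dtype=float).copy()
    history = []
    for _ in range(steps + 1):
        p = softmax(z)
        history.append((entropy(p), expected_cardinality(p, n)))
        z = z - lr * entropy_grad(z)
    return history

# Hypothetical starting logits over four states.
history = minimize_entropy([2.0, 1.0, 0.5, 0.0])
print("start: entropy=%.4f, expected cardinality=%.4f" % history[0])
print("end:   entropy=%.4f, expected cardinality=%.4f" % history[-1])
```

Comparing the first and last entries of `history` shows both quantities falling together for this starting point, while the standard cardinality (four states with nonzero probability) never changes. The sketch illustrates the claim on one example and does not prove it in general.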

Taxonomy

Topics: Machine Learning and Algorithms · Neural Networks and Applications · Reinforcement Learning in Robotics