Partitioning the Sample Space for a More Precise Shannon Entropy Estimation

Gabriel F.A. Bastos; Jugurta Montalvão

arXiv:2512.10133 · cs.LG · December 12, 2025

Open Access

TL;DR

This paper introduces a discrete Shannon entropy estimator that improves accuracy in small-data regimes by partitioning the sample space and compensating for unseen outcomes. It outperforms some classical estimators in undersampled regimes.

Contribution

The paper proposes a novel entropy estimation method leveraging sample space partitioning and missing mass estimation to reduce bias in undersampled data.

Findings

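01 Outperforms some classical estimators in undersampled regimes

02 Performs comparably with some well-established state-of-the-art estimators

03 Effective in small-data scenarios, where the number of examples may be smaller than the number of possible outcomes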

Abstract

Reliable data-driven estimation of Shannon entropy from small data sets, where the number of examples is potentially smaller than the number of possible outcomes, is a critical matter in several applications. In this paper, we introduce a discrete entropy estimator, where we use the decomposability property in combination with estimations of the missing mass and the number of unseen outcomes to compensate for the negative bias induced by them. Experimental results show that the proposed method outperforms some classical estimators in undersampled regimes, and performs comparably with some well-established state-of-the-art estimators.
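
The decomposability property mentioned in the abstract can be illustrated with a short sketch. For a partition of the outcomes into two blocks A and B with masses 1 - m and m, Shannon entropy satisfies H(p) = h(m) + (1 - m) H(p | A) + m H(p | B), where h is the binary entropy. The code below first checks this identity on a known distribution, then applies it with A as the observed outcomes and B as the unseen ones. Everything in the second function is an assumption made for illustration and is not taken from the paper: the Good-Turing estimate for the missing mass, the bias-corrected Chao1 estimate for the number of unseen outcomes, the uniform distribution over unseen outcomes, and the function names themselves. It is not the authors' estimator.

```python
import math
from collections import Counter


def entropy(probs):
    """Shannon entropy in nats of a probability vector (zero entries are skipped)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def decomposed_entropy(probs, split):
    """Entropy computed through the decomposability property for the partition
    A = probs[:split], B = probs[split:]. Both blocks must have positive mass."""
    mass_b = sum(probs[split:])
    mass_a = 1.0 - mass_b
    within_a = entropy([p / mass_a for p in probs[:split]])
    within_b = entropy([p / mass_b for p in probs[split:]])
    return entropy([mass_a, mass_b]) + mass_a * within_a + mass_b * within_b


def partition_entropy_sketch(samples):
    """Hypothetical illustration only (NOT the paper's estimator).

    Partitions the sample space into seen and unseen outcomes and combines
    the two blocks with the decomposability identity. Expects a non-empty sample.
    """
    counts = Counter(samples)
    n = len(samples)
    freq_of_freq = Counter(counts.values())
    f1, f2 = freq_of_freq[1], freq_of_freq[2]  # Counter returns 0 for absent keys

    # Assumption: Good-Turing estimate of the missing mass.
    missing_mass = f1 / n
    # Assumption: bias-corrected Chao1 estimate of the number of unseen outcomes.
    n_unseen = f1 * (f1 - 1) / (2 * (f2 + 1))

    # Plug-in entropy of the empirical distribution over seen outcomes.
    h_seen = entropy([c / n for c in counts.values()])
    # Assumption: unseen outcomes are treated as uniformly distributed.
    h_unseen = math.log(n_unseen) if n_unseen > 1 else 0.0

    return (entropy([missing_mass, 1.0 - missing_mass])
            + (1.0 - missing_mass) * h_seen
            + missing_mass * h_unseen)
```

When the sample contains no singletons, the estimated missing mass is zero and the sketch reduces to the plug-in estimate. When singletons are present, the extra terms push the estimate upward, which is the direction needed to offset the negative bias of the plug-in estimator in undersampled regimes.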

Taxonomy

Topics: Time Series Analysis and Forecasting · Bayesian Methods and Mixture Models · Statistical Methods and Inference