Symbolic Density Estimation for Discrete Distributions
Ziwen Liu, Meng Li

TL;DR
This paper presents symbolic density estimation (SDE), an unsupervised method that automatically derives closed-form discrete probability distributions using structured search and domain priors.
Contribution
It introduces SDE, a novel framework combining evolutionary search and validity-aware inference to recover interpretable discrete distributions automatically.
Findings
SDE recovers all benchmark distribution families with accurate parameter estimates.
It extends to richer distribution families such as zero-inflated models and finite mixtures.
In a real-data application, interpretable mixture models improve goodness of fit.
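For reference, zero inflation and finite mixtures have standard textbook forms. The notation here (base PMF p, inflation weight \pi, component PMFs p_j with weights w_j) is assumed for illustration and is not taken from the paper.

```latex
\[
P_{\mathrm{ZI}}(X = k) = \pi\,\mathbf{1}\{k = 0\} + (1 - \pi)\,p(k), \qquad 0 \le \pi \le 1,
\]
\[
P_{\mathrm{mix}}(X = k) = \sum_{j=1}^{J} w_j\,p_j(k), \qquad w_j \ge 0,\ \sum_{j=1}^{J} w_j = 1.
\]
```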
Abstract
Discrete probability laws underpin statistical modeling, yet the catalog of interpretable distributions has expanded only gradually through centuries of case-by-case mathematical derivations. We introduce symbolic density estimation (SDE), an unsupervised framework that automatically recovers closed-form probability mass functions by composing elementary analytic operations within a structured search space. Our method integrates domain-specific structural priors with evolutionary search and a validity-aware inference stage, and it extends to richer distribution families such as zero inflation and finite mixtures. To support systematic evaluation and future research, we contribute a benchmark dataset spanning a broad collection of commonly used discrete distributions. The proposed algorithm recovers all benchmark families with accurate parameter estimates. A real data application shows…
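The abstract mentions a validity-aware inference stage and a zero-inflation extension without giving details here. The following is a minimal sketch of what a validity check on a candidate closed form could look like, assuming support on the non-negative integers and a numerical truncation at `k_max`. It is not the paper's algorithm; the function names (`poisson_pmf`, `unnormalized_candidate`, `zero_inflate`, `is_valid_pmf`) are hypothetical and chosen only for illustration.

```python
import math

def poisson_pmf(k, lam):
    # Closed-form candidate: exp(-lam) * lam**k / k!, computed in log space
    # so that large k does not overflow.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def unnormalized_candidate(k, lam):
    # A candidate missing the exp(-lam) factor; its total mass is exp(lam), not 1.
    return math.exp(k * math.log(lam) - math.lgamma(k + 1))

def zero_inflate(pmf, pi):
    # Wrap a base PMF with extra point mass pi at zero (standard zero inflation).
    def zi(k, *params):
        base = (1.0 - pi) * pmf(k, *params)
        return pi + base if k == 0 else base
    return zi

def is_valid_pmf(pmf, params, k_max=200, tol=1e-8):
    # Numerical check (an assumption of this sketch): values are non-negative
    # and the mass over 0..k_max sums to 1 within tol.
    values = [pmf(k, *params) for k in range(k_max + 1)]
    return all(v >= 0.0 for v in values) and abs(sum(values) - 1.0) < tol

if __name__ == "__main__":
    zip_pmf = zero_inflate(poisson_pmf, 0.2)
    print(is_valid_pmf(poisson_pmf, (3.0,)))
    print(is_valid_pmf(zip_pmf, (3.0,)))
    print(is_valid_pmf(unnormalized_candidate, (3.0,)))
```

A search procedure could use a check of this kind to discard symbolic candidates that are not proper probability mass functions before estimating their parameters. The truncation point and tolerance are arbitrary choices for this sketch, and heavy-tailed candidates would need a larger `k_max`.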
