Sparse Density Trees and Lists: An Interpretable Alternative to High-Dimensional Histograms
Siong Thye Goh, Lesia Semenova, Cynthia Rudin

TL;DR
This paper introduces sparse, interpretable density estimation models using trees and lists for high-dimensional categorical data, improving visualization, sparsity, and accuracy over traditional histograms.
Contribution
The authors develop novel Bayesian sparse density models based on trees and lists, offering flexible control over model complexity and improved high-dimensional density estimation.
Findings
The proposed tree and list models outperform traditional histograms in high dimensions.
The models provide interpretable visualizations of the estimated density, including in more than two dimensions.
An application to crime analysis demonstrates practical utility.
Abstract
We present sparse tree-based and list-based density estimation methods for binary/categorical data. Our density estimation models are higher dimensional analogies to variable bin width histograms. In each leaf of the tree (or list), the density is constant, similar to the flat density within the bin of a histogram. Histograms, however, cannot easily be visualized in more than two dimensions, whereas our models can. The accuracy of histograms fades as dimensions increase, whereas our models have priors that help with generalization. Our models are sparse, unlike high-dimensional fixed-bin histograms. We present three generative modeling methods, where the first one allows the user to specify the preferred number of leaves in the tree within a Bayesian prior. The second method allows the user to specify the preferred number of branches within the prior. The third method returns density…
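The "constant density in each leaf" idea from the abstract can be illustrated with a minimal sketch. It is not the authors' Bayesian method: the tree is fixed by hand rather than learned, there is no prior, and leaf densities are plain empirical frequencies divided by leaf volume. The names `Leaf`, `fit_counts`, and `density` are hypothetical, and binary features are assumed.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Sequence


@dataclass
class Leaf:
    # Hypothetical representation: feature index -> required binary value.
    conditions: Dict[int, int]
    count: int = 0


def matches(leaf: Leaf, x: Sequence[int]) -> bool:
    """True if the point x satisfies every condition on the path to this leaf."""
    return all(x[j] == v for j, v in leaf.conditions.items())


def leaf_volume(leaf: Leaf, n_features: int) -> int:
    """Number of distinct binary cells covered by the leaf."""
    return 2 ** (n_features - len(leaf.conditions))


def fit_counts(leaves: List[Leaf], data: List[Sequence[int]]) -> None:
    """Count how many training points fall in each leaf."""
    for leaf in leaves:
        leaf.count = 0
    for x in data:
        for leaf in leaves:
            if matches(leaf, x):
                leaf.count += 1
                break


def density(leaves: List[Leaf], x: Sequence[int], n_total: int, n_features: int) -> float:
    """Piecewise-constant density: leaf frequency divided by leaf volume."""
    for leaf in leaves:
        if matches(leaf, x):
            return leaf.count / (n_total * leaf_volume(leaf, n_features))
    raise ValueError("point is not covered by any leaf")


# A hand-picked tree on 3 binary features (an assumption for illustration):
# split on x0; when x0 == 0, split again on x1.
leaves = [Leaf({0: 1}), Leaf({0: 0, 1: 1}), Leaf({0: 0, 1: 0})]
data = [
    (1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1),
    (0, 1, 0), (0, 1, 1), (0, 1, 1),
    (0, 0, 0),
]
fit_counts(leaves, data)

# The density is flat inside each leaf and sums to 1 over all 8 cells.
total = sum(density(leaves, x, len(data), 3) for x in product((0, 1), repeat=3))
```

Three leaves describe all eight cells here, whereas a fixed-bin histogram on the same features would need one bin per cell. This gap is the sparsity the abstract contrasts with high-dimensional histograms.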
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Analysis with R · Data Mining Algorithms and Applications
