Histogram Meets Topic Model: Density Estimation by Mixture of Histograms
Hideaki Kim, Hiroshi Sawada

TL;DR
This paper introduces a Bayesian method that combines histograms with topic models to improve density estimation for sparse data. The number of bins and the bin heights of each basis histogram are inferred automatically by collapsed Gibbs sampling.
Contribution
It presents a novel Bayesian approach that models density as a mixture of histograms with automatic bin selection, addressing sparsity issues in traditional histograms.
Findings
Performs well on synthetic data
Automatically determines number of bins and heights
Alleviates sparsity problems in density estimation
Abstract
The histogram method is a powerful non-parametric approach for estimating the probability density function of a continuous variable. But the construction of a histogram, compared to the parametric approaches, demands a large number of observations to capture the underlying density function. Thus it is not suitable for analyzing a sparse data set, a collection of units with a small size of data. In this paper, by employing the probabilistic topic model, we develop a novel Bayesian approach to alleviating the sparsity problem in the conventional histogram estimation. Our method estimates a unit's density function as a mixture of basis histograms, in which the number of bins for each basis, as well as their heights, is determined automatically. The estimation procedure is performed by using the fast and easy-to-implement collapsed Gibbs sampling. We apply the proposed method to synthetic…
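The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes data rescaled to [0, 1), a fixed number of equal-width bins shared by all bases (the paper infers the number of bins per basis, which this sketch does not attempt), and symmetric Dirichlet priors. Under those assumptions the model reduces to a standard topic model over bin indices, fitted by collapsed Gibbs sampling. The names `fit_histogram_mixture` and `unit_density`, and the hyperparameters `alpha` and `beta`, are hypothetical and chosen for illustration only.

```python
import numpy as np

def fit_histogram_mixture(units, n_bases=2, n_bins=10, alpha=1.0, beta=0.5,
                          n_iter=200, seed=0):
    """Sketch only. units: list of 1-D arrays with values in [0, 1).
    Assumes FIXED equal-width bins; the paper infers bin counts instead."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Each observation becomes a bin index, playing the role of a "word".
    docs = [np.clip(np.digitize(u, edges) - 1, 0, n_bins - 1) for u in units]
    n_uk = np.zeros((len(docs), n_bases))  # unit-by-basis assignment counts
    n_kb = np.zeros((n_bases, n_bins))     # basis-by-bin assignment counts
    n_k = np.zeros(n_bases)                # total assignments per basis
    z = []
    for d, bins in enumerate(docs):        # random initial assignments
        zd = rng.integers(n_bases, size=len(bins))
        z.append(zd)
        for b, k in zip(bins, zd):
            n_uk[d, k] += 1; n_kb[k, b] += 1; n_k[k] += 1
    for _ in range(n_iter):                # collapsed Gibbs sweeps
        for d, bins in enumerate(docs):
            for i, b in enumerate(bins):
                k = z[d][i]                # remove the current assignment
                n_uk[d, k] -= 1; n_kb[k, b] -= 1; n_k[k] -= 1
                # Conditional of the basis given all other assignments.
                p = (n_uk[d] + alpha) * (n_kb[:, b] + beta) / (n_k + n_bins * beta)
                k = rng.choice(n_bases, p=p / p.sum())
                z[d][i] = k                # record the resampled assignment
                n_uk[d, k] += 1; n_kb[k, b] += 1; n_k[k] += 1
    width = 1.0 / n_bins
    # Smoothed estimates from the final sample: mixing weights per unit,
    # and bin heights per basis (probability mass divided by bin width).
    weights = (n_uk + alpha) / (n_uk + alpha).sum(axis=1, keepdims=True)
    heights = (n_kb + beta) / (n_kb + beta).sum(axis=1, keepdims=True) / width
    return edges, weights, heights

def unit_density(x, edges, unit_weights, heights):
    """Density of one unit at x: a weighted sum of the basis histograms."""
    idx = np.clip(np.digitize(x, edges) - 1, 0, heights.shape[1] - 1)
    return unit_weights @ heights[:, idx]
```

Calling `fit_histogram_mixture` on a list of small samples returns the shared bin edges, one row of mixing weights per unit, and one row of bin heights per basis; `unit_density(x, edges, weights[u], heights)` then evaluates the estimated density of unit `u`. Because the bases are shared across units, a unit with only a handful of observations borrows bin heights learned from all the others, which is the mechanism the abstract describes for easing sparsity.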
Taxonomy
Topics: Bayesian Methods and Mixture Models · Time Series Analysis and Forecasting · Gaussian Processes and Bayesian Inference
