TL;DR
This paper introduces a density estimation method that reconstructs the underlying distribution from noisy, incomplete, and heterogeneous data by extending Gaussian mixture models to handle per-point uncertainties and missing data.
Contribution
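It generalizes Gaussian mixture models and the accompanying EM algorithm to data points that each carry their own (heteroskedastic) uncertainty covariance and missing-data pattern, and extends the basic algorithm with conjugate priors on the model parameters and a split-and-merge procedure designed to avoid local maxima of the likelihood.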
Findings
Successfully applied to infer stellar velocity distributions from Hipparcos data.
Effectively reconstructs underlying distributions despite heteroskedastic noise.
Demonstrates robustness in handling incomplete and noisy observations.
Abstract
We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation-Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-merge" procedure designed to avoid local maxima of the likelihood. We demonstrate the…
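The "convolve, then project" idea in the abstract can be made concrete with a small sketch. It assumes each observation is modeled as w_i = R_i v_i + noise, where v_i is drawn from the underlying mixture, R_i is a projection matrix that drops the missing directions, and S_i is the noise covariance of that point. Under this assumption, component j predicts w_i with mean R_i m_j and covariance R_i V_j R_i^T + S_i. The function names (`log_gauss`, `log_likelihood`, `em_step`) and the variable names are illustrative, not taken from the paper, and the update formulas are the standard EM updates for this model as I understand them; the abstract itself does not spell them out. The priors and the split-and-merge step are left out.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log density of a multivariate normal N(x | mean, cov)."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (x.shape[0] * np.log(2.0 * np.pi) + logdet + maha)

def log_likelihood(w, R, S, alpha, m, V):
    """Total log-likelihood of the noisy, projected observations.

    w, R, S are lists (one entry per data point) because each point may
    have a different observed dimension; alpha, m, V describe the
    underlying mixture (weights, means, covariances).
    """
    total = 0.0
    for w_i, R_i, S_i in zip(w, R, S):
        terms = [np.log(a) + log_gauss(w_i, R_i @ m_j, R_i @ V_j @ R_i.T + S_i)
                 for a, m_j, V_j in zip(alpha, m, V)]
        total += np.logaddexp.reduce(terms)
    return total

def em_step(w, R, S, alpha, m, V):
    """One EM update of the underlying (deconvolved) mixture parameters."""
    N = len(w)
    K, d = m.shape
    q = np.zeros((N, K))        # responsibilities
    b = np.zeros((N, K, d))     # posterior mean of the noise-free value
    B = np.zeros((N, K, d, d))  # posterior covariance of the noise-free value
    for i in range(N):
        logr = np.zeros(K)
        for j in range(K):
            # Convolve with this point's noise and project out missing directions.
            T = R[i] @ V[j] @ R[i].T + S[i]
            gain = V[j] @ R[i].T @ np.linalg.inv(T)
            logr[j] = np.log(alpha[j]) + log_gauss(w[i], R[i] @ m[j], T)
            b[i, j] = m[j] + gain @ (w[i] - R[i] @ m[j])
            B[i, j] = V[j] - gain @ R[i] @ V[j]
        q[i] = np.exp(logr - np.logaddexp.reduce(logr))
    qj = q.sum(axis=0)
    new_alpha = qj / N
    new_m = np.einsum("ij,ijd->jd", q, b) / qj[:, None]
    diff = new_m[None, :, :] - b
    outer = np.einsum("ijd,ije->ijde", diff, diff)
    new_V = np.einsum("ij,ijde->jde", q, outer + B) / qj[:, None, None]
    return new_alpha, new_m, new_V
```

Calling `em_step` repeatedly until `log_likelihood` stops improving gives a plain maximum-likelihood fit. A fully observed point uses the identity matrix for R_i, while a point missing its second coordinate in two dimensions would use `np.array([[1.0, 0.0]])`.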
