Approximate Inference via Clustering
Qianqian Song

TL;DR
This paper introduces a clustering-based method that simplifies Bayesian inference on large, well-clustered datasets: each data point is replaced by the centroid of its cluster when the posterior is formed, which makes sampling cheaper while keeping the approximate posterior close to the true one.
Contribution
It proposes a novel approximate Bayesian inference approach leveraging clustering to reduce complexity and provides theoretical and empirical validation of its effectiveness.
Findings
Approximate posterior closely matches the true posterior under certain conditions.
Clustering-based approximation significantly reduces sampling complexity.
Experimental results confirm the method's efficiency and accuracy.
Abstract
In recent years, large-scale Bayesian learning has drawn a great deal of attention. However, in the big-data era, the amount of data we face is growing much faster than our ability to deal with it. Fortunately, large-scale datasets are observed to have rich internal structure and to be somewhat redundant. In this paper, we attempt to simplify the Bayesian posterior by exploiting this structure. Specifically, we restrict our interest to so-called well-clustered datasets and construct an approximate posterior from the clustering information. The clustering structure can be obtained efficiently via a particular clustering algorithm. When constructing the approximate posterior, all data points in the same cluster are replaced by the centroid of that cluster. As a result, the posterior can be significantly simplified. Theoretically, we show that under…
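The centroid-replacement step described in the abstract can be sketched in a few lines. This is an illustrative reading, not the paper's implementation: it assumes each centroid's log-likelihood is weighted by its cluster size, uses plain Lloyd's k-means as a stand-in for the paper's unspecified clustering algorithm, and picks a 1D Gaussian likelihood with a Gaussian prior as a toy model. All function names and parameter values are hypothetical.

```python
import math
import random


def log_lik(x, theta, sigma=1.0):
    """Log-density of N(theta, sigma^2) at x (toy likelihood, assumed)."""
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (x - theta) ** 2 / (2.0 * sigma ** 2))


def log_prior(theta, tau=10.0):
    """Log-density of a N(0, tau^2) prior on theta (assumed)."""
    return (-0.5 * math.log(2.0 * math.pi * tau ** 2)
            - theta ** 2 / (2.0 * tau ** 2))


def kmeans_1d(data, k, iters=25, seed=0):
    """Plain Lloyd's k-means in 1D; a stand-in for the paper's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(data, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centroids[j]))
            clusters[nearest].append(x)
        # Each centroid becomes the mean of its members (kept if empty).
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids, [len(c) for c in clusters]


def full_log_post(theta, data):
    """Unnormalized log-posterior using every data point: N likelihood terms."""
    return log_prior(theta) + sum(log_lik(x, theta) for x in data)


def approx_log_post(theta, centroids, counts):
    """Unnormalized log-posterior with each point replaced by its centroid:
    K likelihood terms, each weighted by the cluster size."""
    return log_prior(theta) + sum(n * log_lik(c, theta)
                                  for c, n in zip(centroids, counts))


# Synthetic well-clustered data: three tight groups of 200 points each.
rng = random.Random(1)
data = [rng.gauss(m, 0.05) for m in (-2.0, 0.0, 3.0) for _ in range(200)]
centroids, counts = kmeans_1d(data, k=3)

for theta in (-1.0, 0.0, 0.5):
    print(theta, full_log_post(theta, data),
          approx_log_post(theta, centroids, counts))
```

Each evaluation of the approximate log-posterior touches K centroids instead of N data points, which is where the saving in sampling cost comes from. In this particular Gaussian-mean toy model, the gap between the two unnormalized log-posteriors does not depend on theta when centroids are cluster means, so the two posteriors coincide after normalization; other likelihoods will generally incur a theta-dependent error that shrinks as clusters get tighter.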
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Bayesian Modeling and Causal Inference · Bayesian Methods and Mixture Models
