Posterior Sampling of Probabilistic Word Embeddings
Väinö Yrjänäinen, Isac Boström, Måns Magnusson, Johan Jonasson

TL;DR
This paper introduces a scalable Gibbs sampler for Bayesian word embeddings that, like HMC, estimates uncertainty correctly where MFVI does not, while remaining feasible on larger real datasets.
Contribution
The paper presents a novel Gibbs sampler using Polya-Gamma augmentation for scalable Bayesian word embeddings, addressing non-identifiability and improving uncertainty estimation.
Findings
Gibbs sampler and HMC accurately estimate uncertainties.
MFVI fails to estimate uncertainties properly.
Posterior mean embeddings outperform MAP estimates in likelihood.
Abstract
Quantifying uncertainty in word embeddings is crucial for reliable inference from textual data. However, existing Bayesian methods such as Hamiltonian Monte Carlo (HMC) and mean-field variational inference (MFVI) are either computationally infeasible for large data or rely on restrictive assumptions. We propose a scalable Gibbs sampler using Polya-Gamma augmentation as well as Laplace approximation and compare them with MFVI and HMC for word embeddings. In addition, we address non-identifiability in word embeddings. Our Gibbs sampler and HMC correctly estimate uncertainties, while MFVI does not, and Laplace approximation only does so on large sample sizes, as expected. Applying the Gibbs sampler to the US Congress and the Movielens datasets, we demonstrate the feasibility on larger real data. Finally, as a result of having draws from the full posterior, we show that the posterior mean…
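To make the idea of a Gibbs sampler with Polya-Gamma augmentation concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It applies the augmentation to plain Bayesian logistic regression with an isotropic Gaussian prior, as a stand-in for a single conditional update in an embedding model. The function names `sample_pg_approx` and `gibbs_logistic`, the prior precision, and the truncation level `n_terms` are hypothetical choices made for illustration. The Polya-Gamma draws use a truncated sum-of-gammas series, which is only approximate; an exact sampler would be preferable in practice.

```python
import numpy as np


def sample_pg_approx(c, rng, n_terms=200):
    """Approximate draws from PG(1, c), one per entry of c.

    Uses the infinite sum-of-gammas representation of the Polya-Gamma
    distribution, truncated to n_terms terms (an approximation).
    """
    c = np.atleast_1d(np.asarray(c, dtype=float))
    k = np.arange(1, n_terms + 1)
    # g_k ~ Gamma(1, 1), independent for each entry of c and each term k
    g = rng.gamma(shape=1.0, scale=1.0, size=(c.size, n_terms))
    denom = (k - 0.5) ** 2 + (c[:, None] ** 2) / (4.0 * np.pi ** 2)
    return (g / denom).sum(axis=1) / (2.0 * np.pi ** 2)


def gibbs_logistic(X, y, n_iter=500, prior_precision=1.0, seed=0):
    """Gibbs sampler for logistic regression with a N(0, I / prior_precision) prior.

    Alternates between
      omega_i | beta   ~ PG(1, x_i^T beta)
      beta | omega, y  ~ N(m, V),  V = (X^T diag(omega) X + prior_precision * I)^{-1},
                                   m = V X^T (y - 1/2).
    Returns an array of shape (n_iter, d) holding the draws of beta.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    kappa = y - 0.5
    draws = np.empty((n_iter, d))
    for t in range(n_iter):
        # Step 1: augmentation variables given the current coefficients
        omega = sample_pg_approx(X @ beta, rng)
        # Step 2: Gaussian conditional for beta given omega
        precision = X.T @ (omega[:, None] * X) + prior_precision * np.eye(d)
        mean = np.linalg.solve(precision, X.T @ kappa)
        # precision = L L^T, so solving L^T u = z gives u with covariance precision^{-1}
        L = np.linalg.cholesky(precision)
        z = rng.standard_normal(d)
        beta = mean + np.linalg.solve(L.T, z)
        draws[t] = beta
    return draws
```

Given a design matrix `X` and binary labels `y`, `gibbs_logistic(X, y)` returns posterior draws; averaging them after discarding an initial burn-in gives a posterior mean estimate, and their spread gives the kind of uncertainty estimate the abstract refers to. The appeal of the augmentation is that both conditional distributions are standard, so no step-size tuning or gradient computation is needed.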
Taxonomy
Topics: Topic Modeling · Bayesian Methods and Mixture Models · Computational and Text Analysis Methods
