Differentiable Sampling of Categorical Distributions Using the CatLog-Derivative Trick
Lennert De Smet, Emanuele Sansone, Pedro Zuidberg Dos Martires

TL;DR
This paper introduces the CatLog-Derivative trick, a new method for differentiating categorical distributions, and proposes IndeCateR, an unbiased gradient estimator with lower variance, improving learning in models with categorical latent variables.
Contribution
The paper presents the CatLog-Derivative trick tailored for categorical distributions and introduces IndeCateR, a novel unbiased gradient estimator with lower variance than REINFORCE.
Findings
IndeCateR achieves lower variance than REINFORCE.
Empirically, IndeCateR's gradient estimates show lower bias and variance than competing estimators for the same number of samples.
IndeCateR can be implemented efficiently in practice.
Abstract
Categorical random variables can faithfully represent the discrete and uncertain aspects of data as part of a discrete latent variable model. Learning in such models necessitates taking gradients with respect to the parameters of the categorical probability distributions, which is often intractable due to their combinatorial nature. A popular technique to estimate these otherwise intractable gradients is the Log-Derivative trick. This trick forms the basis of the well-known REINFORCE gradient estimator and its many extensions. While the Log-Derivative trick allows us to differentiate through samples drawn from categorical distributions, it does not take into account the discrete nature of the distribution itself. Our first contribution addresses this shortcoming by introducing the CatLog-Derivative trick - a variation of the Log-Derivative trick tailored towards categorical…
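To make the Log-Derivative trick mentioned in the abstract concrete, here is a minimal sketch for a single categorical variable. It is not the paper's CatLog-Derivative trick or IndeCateR. It only illustrates the REINFORCE baseline, using the identity that the gradient of E[f(X)] equals E[f(X) * grad log p(X)]. The softmax-over-logits parameterization, the test function `f`, and the names `softmax`, `exact_gradient`, and `reinforce_gradient` are assumptions chosen for illustration. The exact gradient, obtained by enumerating all categories, is included only as a reference to compare against.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exact_gradient(logits, f):
    """Gradient of E[f(X)] w.r.t. each logit, by enumerating all categories."""
    p = softmax(logits)
    mean_f = sum(p_k * f(k) for k, p_k in enumerate(p))
    # For a softmax, d E[f] / d logit_j = p_j * (f(j) - E[f]).
    return [p_j * (f(j) - mean_f) for j, p_j in enumerate(p)]


def reinforce_gradient(logits, f, num_samples, rng):
    """Monte Carlo estimate of the same gradient via the Log-Derivative trick."""
    p = softmax(logits)
    num_categories = len(p)
    grad = [0.0] * num_categories
    samples = rng.choices(range(num_categories), weights=p, k=num_samples)
    for x in samples:
        fx = f(x)
        for j in range(num_categories):
            # d log p(x) / d logit_j = 1[x == j] - p_j for a softmax.
            grad[j] += fx * ((1.0 if x == j else 0.0) - p[j])
    return [g / num_samples for g in grad]


if __name__ == "__main__":
    logits = [0.5, -1.0, 2.0]           # hypothetical parameters
    f = lambda x: float(x * x)          # arbitrary test function
    rng = random.Random(0)
    print("exact:    ", exact_gradient(logits, f))
    print("REINFORCE:", reinforce_gradient(logits, f, 20000, rng))
```

With enough samples the REINFORCE estimate approaches the enumerated gradient, but each estimate is noisy. With several categorical variables, enumerating all joint outcomes becomes intractable, which is the setting the abstract describes.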
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Machine Learning and Algorithms
Methods: REINFORCE
