Pathwise Derivatives Beyond the Reparameterization Trick
Martin Jankowiak, Fritz Obermeyer

TL;DR
This paper extends pathwise gradient methods beyond the reparameterization trick by treating reparameterization gradients as solutions of the transport equation from optimal transport. This yields (approximate) pathwise gradients for the Gamma, Beta, and Dirichlet distributions.
Contribution
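It shows that reparameterization gradients correspond to solutions of the transport equation, uses this perspective to obtain pathwise gradients for the Gamma, Beta, and Dirichlet distributions, and derives optimal-transport gradients for the multivariate Normal that have lower variance than those from the Cholesky-factorized reparameterization.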
Findings
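Approximate pathwise gradients can be computed for the Gamma, Beta, and Dirichlet distributions, which are not directly amenable to the reparameterization trick.
For the multivariate Normal, gradients from the Cholesky-factorized reparameterization are suboptimal in the sense of optimal transport; the derived optimal gradients have reduced variance in a Gaussian Process regression task.
The resulting pathwise gradients are competitive with other methods in synthetic experiments and stochastic variational inference tasks.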
Abstract
We observe that gradients computed via the reparameterization trick are in direct correspondence with solutions of the transport equation in the formalism of optimal transport. We use this perspective to compute (approximate) pathwise gradients for probability distributions not directly amenable to the reparameterization trick: Gamma, Beta, and Dirichlet. We further observe that when the reparameterization trick is applied to the Cholesky-factorized multivariate Normal distribution, the resulting gradients are suboptimal in the sense of optimal transport. We derive the optimal gradients and show that they have reduced variance in a Gaussian Process regression task. We demonstrate with a variety of synthetic experiments and stochastic variational inference tasks that our pathwise gradients are competitive with other methods.
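As a rough illustration of the pathwise idea for a distribution without a simple reparameterization, the sketch below uses the one-dimensional relation dz/dα = -(∂F/∂α)(z; α) / q(z; α), where F is the CDF and q the density of Gamma(α, 1). Everything in it is an assumption made for illustration: the function names are hypothetical, the CDF is evaluated with a plain power series, and the derivative in α is taken by central finite differences, which is a stand-in and not the approximation scheme used in the paper.

```python
import math
import random


def gamma_cdf(x, alpha, terms=150):
    """CDF of Gamma(alpha, 1), i.e. the regularized lower incomplete gamma function.

    Uses the series P(a, x) = exp(-x) * x**a * sum_n x**n / Gamma(a + n + 1),
    evaluated in log space term by term. Adequate for moderate x only.
    """
    if x <= 0.0:
        return 0.0
    log_x = math.log(x)
    total = 0.0
    for n in range(terms):
        total += math.exp(-x + (alpha + n) * log_x - math.lgamma(alpha + n + 1.0))
    return total


def gamma_pdf(x, alpha):
    """Density of Gamma(alpha, 1) at x > 0."""
    return math.exp((alpha - 1.0) * math.log(x) - x - math.lgamma(alpha))


def pathwise_dz_dalpha(z, alpha, eps=1e-4):
    """Pathwise derivative dz/dalpha = -(dF/dalpha) / q at a sample z.

    dF/dalpha is approximated by a central finite difference (illustrative only).
    """
    dcdf = (gamma_cdf(z, alpha + eps) - gamma_cdf(z, alpha - eps)) / (2.0 * eps)
    return -dcdf / gamma_pdf(z, alpha)


def estimate_grad_mean(alpha, num_samples=5000, seed=0):
    """Monte Carlo estimate of d/dalpha E[z] for z ~ Gamma(alpha, 1).

    With f(z) = z the pathwise estimator is the sample mean of dz/dalpha.
    Since E[z] = alpha, the exact value is 1.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        z = rng.gammavariate(alpha, 1.0)  # shape alpha, scale 1
        total += pathwise_dz_dalpha(z, alpha)
    return total / num_samples


if __name__ == "__main__":
    # Should print a value close to 1, up to Monte Carlo noise.
    print(estimate_grad_mean(2.0))
```

Because E[z] = α for Gamma(α, 1), the estimate should land near 1, which gives a quick sanity check on the sign and scale of the pathwise derivative. For a general test function f, the same estimator averages f'(z) · dz/dα over samples.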
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks · Neural Networks and Applications
Methods: Gaussian Process
