Low-variance Black-box Gradient Estimates for the Plackett-Luce Distribution

Artyom Gadetsky; Kirill Struminsky; Christopher Robinson; Novi Quadrianto; Dmitry Vetrov

arXiv:1911.10036·cs.LG·November 25, 2019

TL;DR

This paper introduces low-variance gradient estimators for models with latent permutations, which makes it possible to optimize black-box functions over permutations with stochastic gradient descent.

Contribution

The paper proposes control variates for the Plackett-Luce distribution that reduce the variance of gradient estimates in models with latent permutations.

Findings

01 Outperforms competitive relaxation-based optimization methods on causal structure learning tasks

02 Demonstrated on both continuous and discrete data

03 Applicable to non-differentiable score functions

Abstract

Learning models with discrete latent variables using stochastic gradient descent remains a challenge due to the high variance of gradient estimates. Modern variance reduction techniques mostly consider categorical distributions and have limited applicability when the number of possible outcomes becomes large. In this work, we consider models with latent permutations and propose control variates for the Plackett-Luce distribution. In particular, the control variates allow us to optimize black-box functions over permutations using stochastic gradient descent. To illustrate the approach, we consider a variety of causal structure learning tasks for continuous and discrete data. We show that our method outperforms competitive relaxation-based optimization methods and is also applicable to non-differentiable score functions.
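
To make the setting in the abstract concrete, the sketch below samples permutations from a Plackett-Luce distribution and forms a plain score-function (REINFORCE) gradient estimate of the expected value of a black-box function. It is a minimal illustration under stated assumptions, not the paper's method: the leave-one-out mean baseline stands in for the control variates the paper proposes, and every name in it (`sample_plackett_luce`, `log_prob`, `grad_log_prob`, `reinforce_gradient`, `count_inversions`) is hypothetical and unrelated to the linked repository.

```python
import math
import random


def sample_plackett_luce(log_weights, rng):
    """Sample a permutation by sorting Gumbel-perturbed log-weights, largest first."""
    perturbed = []
    for lw in log_weights:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        perturbed.append(lw - math.log(-math.log(u)))
    return sorted(range(len(log_weights)), key=lambda i: -perturbed[i])


def log_prob(log_weights, perm):
    """Log-probability of perm: a product of softmax choices over the remaining items."""
    total = 0.0
    for k in range(len(perm)):
        remaining = [log_weights[i] for i in perm[k:]]
        m = max(remaining)
        lse = m + math.log(sum(math.exp(r - m) for r in remaining))
        total += log_weights[perm[k]] - lse
    return total


def grad_log_prob(log_weights, perm):
    """Gradient of log_prob with respect to log_weights."""
    grad = [0.0] * len(log_weights)
    for k in range(len(perm)):
        remaining = perm[k:]
        m = max(log_weights[i] for i in remaining)
        exps = {i: math.exp(log_weights[i] - m) for i in remaining}
        z = sum(exps.values())
        grad[perm[k]] += 1.0          # the item chosen at step k
        for i in remaining:
            grad[i] -= exps[i] / z    # softmax over the items still available
    return grad


def reinforce_gradient(log_weights, f, num_samples, rng):
    """Score-function estimate of d/d(log_weights) E[f(perm)].

    Assumption: a leave-one-out mean baseline is used for variance reduction;
    this is a generic choice, not the control variate from the paper.
    Requires num_samples >= 2.
    """
    perms = [sample_plackett_luce(log_weights, rng) for _ in range(num_samples)]
    values = [f(p) for p in perms]
    total = sum(values)
    grad = [0.0] * len(log_weights)
    for p, v in zip(perms, values):
        baseline = (total - v) / (num_samples - 1)
        g = grad_log_prob(log_weights, p)
        for j in range(len(grad)):
            grad[j] += (v - baseline) * g[j] / num_samples
    return grad


def count_inversions(perm):
    """Hypothetical black-box objective: number of out-of-order pairs."""
    n = len(perm)
    return sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
```

A call such as `reinforce_gradient([0.5, -1.0, 2.0], count_inversions, 64, random.Random(0))` returns one gradient component per item. Because `f` is only evaluated, never differentiated, the same estimator applies to non-differentiable scores; the cost is variance, which is the problem the paper's control variates target.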

Code & Models

Repositories

agadetsky/pytorch-pl-variance-reduction
PyTorch · Official
