Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models
Pasquale Minervini, Luca Franceschi, Mathias Niepert

TL;DR
This paper introduces Adaptive IMLE (AIMLE), a gradient estimator for discrete latent variable models that automatically balances the bias of its gradient estimates against the density of gradient information. The authors report that it is more accurate and more sample-efficient than existing estimators.
Contribution
AIMLE is the first adaptive gradient estimator for complex discrete distributions, reducing sample complexity and improving estimation fidelity.
Findings
AIMLE produces more accurate gradient estimates with fewer samples.
It outperforms existing estimators on synthetic and real tasks.
AIMLE adaptively balances bias and gradient information density.
Abstract
The integration of discrete algorithmic components in deep learning architectures has numerous applications. Recently, Implicit Maximum Likelihood Estimation (IMLE; Niepert, Minervini, and Franceschi 2021), a class of gradient estimators for discrete exponential family distributions, was proposed by combining implicit differentiation through perturbation with the path-wise gradient estimator. However, because it relies on a finite difference approximation of the gradients, IMLE is especially sensitive to the choice of the finite difference step size, which needs to be specified by the user. In this work, we present Adaptive IMLE (AIMLE), the first adaptive gradient estimator for complex discrete distributions: it adaptively identifies the target distribution for IMLE by trading off the density of gradient information with the degree of bias in the gradient estimates. We empirically evaluate our…
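To make the step-size sensitivity concrete, the following is a minimal NumPy sketch of a perturbation-based finite-difference gradient estimate with a fixed step size. It is an illustration under stated assumptions, not the authors' implementation: the top-k MAP solver, the Gumbel perturbations, the 1/lam scaling, and the names `top_k_map`, `imle_gradient`, and `gradient_density` are all chosen here for the example.

```python
import numpy as np

def top_k_map(theta, k):
    """MAP state of a k-subset distribution: 0/1 vector marking the k largest scores.
    (Assumed toy solver; any MAP solver could take its place.)"""
    z = np.zeros_like(theta)
    z[np.argsort(theta)[-k:]] = 1.0
    return z

def imle_gradient(theta, grad_z, lam, k, noise):
    """Finite-difference, perturbation-based estimate of dL/dtheta.

    theta  : scores of the discrete distribution
    grad_z : downstream gradient dL/dz at the sampled state
    lam    : finite-difference step size (fixed by the user in this sketch)
    noise  : perturbation shared by both MAP calls
    """
    z = top_k_map(theta + noise, k)
    # Target scores: move theta against the downstream gradient by lam.
    z_target = top_k_map(theta - lam * grad_z + noise, k)
    # The 1/lam scaling is an assumption of this sketch.
    return (z - z_target) / lam

def gradient_density(theta, grad_z, lam, k, n_samples, rng):
    """Fraction of non-zero entries in the estimate, averaged over noise samples."""
    densities = []
    for _ in range(n_samples):
        noise = rng.gumbel(size=theta.shape)  # assumed noise distribution
        g = imle_gradient(theta, grad_z, lam, k, noise)
        densities.append(np.mean(g != 0.0))
    return float(np.mean(densities))

rng = np.random.default_rng(0)
theta = rng.normal(size=20)
grad_z = rng.normal(size=20)
for lam in (1e-3, 1.0, 1e3):
    # Small lam: the two MAP states usually coincide, so the estimate is mostly zero.
    # Large lam: the estimate is dense, but the target moves far from theta.
    print(lam, gradient_density(theta, grad_z, lam, k=5, n_samples=200, rng=rng))
```

In this toy setting a very small `lam` tends to yield estimates that are almost entirely zero, while a very large `lam` yields dense estimates whose target state is driven by the downstream gradient alone. This is the trade-off between gradient density and bias that an adaptive step size is meant to manage; the specific adaptation rule used by AIMLE is not given in the text above and is not sketched here.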
Taxonomy
Topics: Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis
