ARM: Augment-REINFORCE-Merge Gradient for Stochastic Binary Networks
Mingzhang Yin, Mingyuan Zhou

TL;DR
The paper introduces the augment-REINFORCE-merge (ARM) estimator, an unbiased, low-variance, and computationally cheap gradient estimator for backpropagating through stochastic binary layers, which improves training of models with discrete stochastic layers.
Contribution
The paper proposes the ARM estimator, which combines variable augmentation, REINFORCE, and reparameterization, and merges two expectations via common random numbers to achieve adaptive variance reduction when training stochastic binary networks.
Findings
In experiments, ARM provides state-of-the-art performance in auto-encoding variational inference and maximum likelihood estimation for discrete latent variable models.
ARM gradient estimates are unbiased and exhibit low variance.
Abstract
To backpropagate the gradients through stochastic binary layers, we propose the augment-REINFORCE-merge (ARM) estimator that is unbiased, exhibits low variance, and has low computational complexity. Exploiting variable augmentation, REINFORCE, and reparameterization, the ARM estimator achieves adaptive variance reduction for Monte Carlo integration by merging two expectations via common random numbers. The variance-reduction mechanism of the ARM estimator can also be attributed to either antithetic sampling in an augmented space, or the use of an optimal anti-symmetric "self-control" baseline function together with the REINFORCE estimator in that augmented space. Experimental results show the ARM estimator provides state-of-the-art performance in auto-encoding variational inference and maximum likelihood estimation, for discrete latent variable models with one or multiple stochastic…
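The merge step described in the abstract can be illustrated on a single Bernoulli variable. The sketch below is not taken from this page: it assumes the univariate ARM identity as it is usually stated, namely that for z ~ Bernoulli(sigmoid(phi)) the gradient of E[f(z)] with respect to phi equals E over u ~ Uniform(0, 1) of (f(1[u > sigmoid(-phi)]) - f(1[u < sigmoid(phi)])) * (u - 1/2). The function names and the toy objective are hypothetical, chosen only for illustration, and the closed-form gradient is included so the Monte Carlo estimate can be checked.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def arm_gradient(f, phi, num_samples=50_000, seed=0):
    """Monte Carlo ARM estimate of d/dphi E[f(z)], z ~ Bernoulli(sigmoid(phi)).

    Assumed univariate form: both function evaluations share the same
    uniform draw u (common random numbers).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        u = rng.random()
        z_a = 1.0 if u > sigmoid(-phi) else 0.0  # first sample
        z_b = 1.0 if u < sigmoid(phi) else 0.0   # second sample, same u
        total += (f(z_a) - f(z_b)) * (u - 0.5)
    return total / num_samples


def exact_gradient(f, phi):
    """Closed-form gradient for one Bernoulli variable, used as a reference."""
    p = sigmoid(phi)
    return (f(1.0) - f(0.0)) * p * (1.0 - p)


def toy_objective(z):
    # Hypothetical objective, chosen only for illustration.
    return (z - 0.4) ** 2


if __name__ == "__main__":
    phi = 0.5
    print("ARM estimate:", arm_gradient(toy_objective, phi))
    print("exact value: ", exact_gradient(toy_objective, phi))
```

Whenever the two thresholded samples agree, the term is exactly zero, so only the draws of u on which they disagree contribute to the estimate. This is one way to read the "adaptive" variance reduction the abstract mentions.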
Taxonomy
Topics: Statistical Methods and Inference · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning
Methods: REINFORCE
