Straight-Through Estimator as Projected Wasserstein Gradient Flow
Pengyu Cheng, Chang Liu, Chunyuan Li, Dinghan Shen, Ricardo Henao, and Lawrence Carin

TL;DR
This paper provides a theoretical foundation for the widely used Straight-Through estimator by interpreting it as a projected Wasserstein gradient flow, and introduces an improved estimator with better performance on certain distributions.
Contribution
It establishes a theoretical justification for ST as a projected Wasserstein gradient flow and proposes an enhanced estimator for distributions with infinite support.
Findings
ST can be interpreted as a projected Wasserstein gradient flow.
The proposed estimator outperforms existing methods on Poisson distributions.
Empirically, ST and the new estimator perform comparably to or better than other state-of-the-art methods on both Bernoulli and Poisson latent variables.
Abstract
The Straight-Through (ST) estimator is a widely used technique for back-propagating gradients through discrete random variables. However, this effective method lacks theoretical justification. In this paper, we show that ST can be interpreted as the simulation of the projected Wasserstein gradient flow (pWGF). Based on this understanding, a theoretical foundation is established to justify the convergence properties of ST. Further, another pWGF estimator variant is proposed, which exhibits superior performance on distributions with infinite support, e.g., Poisson distributions. Empirically, we show that ST and our proposed estimator, when applied to different types of discrete structures (including both Bernoulli and Poisson latent variables), exhibit comparable or even better performance relative to other state-of-the-art methods. Our results uncover the origin of the widespread…
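To make the back-propagation step mentioned in the abstract concrete, here is a minimal sketch of the classic ST estimator for a single Bernoulli variable. It is not the paper's pWGF construction: the loss `f`, the logit parameterization `theta`, and all function names are assumptions chosen for illustration. The forward pass draws a hard sample `z`; the backward pass pretends that `dz/dp = 1`.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def f(z):
    # Hypothetical downstream loss, chosen only for illustration.
    return (z - 0.45) ** 2


def f_grad(z):
    # Derivative of the illustrative loss with respect to z.
    return 2.0 * (z - 0.45)


def st_gradient(theta, n_samples=10000, seed=0):
    """Monte Carlo ST estimate of d/dtheta E[f(z)], z ~ Bernoulli(sigmoid(theta))."""
    rng = random.Random(seed)
    p = sigmoid(theta)
    dp_dtheta = p * (1.0 - p)  # derivative of the sigmoid
    total = 0.0
    for _ in range(n_samples):
        # Forward pass: draw a hard 0/1 sample.
        z = 1.0 if rng.random() < p else 0.0
        # Backward pass: treat dz/dp as 1 (the straight-through step).
        total += f_grad(z) * 1.0 * dp_dtheta
    return total / n_samples


def exact_gradient(theta):
    """Closed-form gradient of E[f(z)] = p*f(1) + (1-p)*f(0) with respect to theta."""
    p = sigmoid(theta)
    return (f(1.0) - f(0.0)) * p * (1.0 - p)


if __name__ == "__main__":
    for theta in (0.0, 2.0):
        print(theta, st_gradient(theta), exact_gradient(theta))
```

Comparing `st_gradient` with `exact_gradient` at several values of `theta` shows that the ST estimate is generally biased, which is one reason a theoretical account of why it still works in practice is of interest.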
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Groundwater flow and contamination studies · Generative Adversarial Networks and Image Synthesis
