On The Statistical Representation Properties Of The Perturb-Softmax And The Perturb-Argmax Probability Distributions
Hedda Cohen Indelman, Tamir Hazan

TL;DR
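This paper studies the statistical representation properties of the Perturb-Softmax and Perturb-Argmax distributions. It identifies the parameter families for which they are complete (able to represent any probability distribution) and minimal (representing each distribution uniquely), and extends the analysis to Gaussian-Softmax and Gaussian-Argmax, with experimental validation.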
Contribution
A theoretical analysis, based on convexity and differentiability, of the representation properties of the Perturb-Softmax and Perturb-Argmax distributions, extended to Gaussian perturbation models and validated experimentally.
Findings
Identifies the parameter families for which the Perturb-Softmax and Perturb-Argmax distributions are complete and minimal.
Extends the framework to Gaussian-Softmax and Gaussian-Argmax.
Experiments indicate that these Gaussian extensions enjoy a faster convergence rate.
Abstract
The Gumbel-Softmax probability distribution allows learning discrete tokens in generative learning, while the Gumbel-Argmax probability distribution is useful in learning discrete structures in discriminative learning. Despite the efforts invested in optimizing these probability models, their statistical properties are under-explored. In this work, we investigate their representation properties and determine for which families of parameters these probability distributions are complete, i.e., can represent any probability distribution, and minimal, i.e., can represent a probability distribution uniquely. We rely on convexity and differentiability to determine these statistical conditions and extend this framework to general probability models, such as Gaussian-Softmax and Gaussian-Argmax. We experimentally validate the qualities of these extensions, which enjoy a faster convergence rate.…
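To make the two distributions in the abstract concrete, here is a minimal NumPy sketch, not the authors' code. It assumes that Perturb-Argmax means the distribution of the index maximizing theta + gamma for random noise gamma, and that Perturb-Softmax means the softmax of theta + gamma. The function names, the temperature parameter, the example theta, and the sample count are all illustrative choices. With Gumbel noise, the argmax frequencies should approach softmax(theta) (the standard Gumbel-max identity). With Gaussian noise the result is still a valid distribution, though generally a different one.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def gumbel_noise(rng, shape):
    # Standard Gumbel(0, 1) perturbations.
    return rng.gumbel(size=shape)

def gaussian_noise(rng, shape):
    # Standard normal perturbations (the Gaussian variant).
    return rng.standard_normal(size=shape)

def perturb_argmax(theta, noise, rng, n_samples):
    """Monte Carlo estimate of P(argmax_i(theta_i + gamma_i) = i)."""
    gamma = noise(rng, (n_samples, theta.shape[0]))
    winners = np.argmax(theta + gamma, axis=1)
    return np.bincount(winners, minlength=theta.shape[0]) / n_samples

def perturb_softmax(theta, noise, rng, temperature=1.0):
    """One draw of softmax((theta + gamma) / temperature), a point on the simplex."""
    gamma = noise(rng, theta.shape)
    return softmax((theta + gamma) / temperature)

rng = np.random.default_rng(0)
theta = np.array([1.0, 0.0, -1.0])  # illustrative parameters

p_gumbel = perturb_argmax(theta, gumbel_noise, rng, 200_000)
p_gauss = perturb_argmax(theta, gaussian_noise, rng, 200_000)
relaxed = perturb_softmax(theta, gumbel_noise, rng, temperature=0.5)

print("softmax(theta):  ", softmax(theta))
print("Gumbel-Argmax:   ", p_gumbel)
print("Gaussian-Argmax: ", p_gauss)
print("Gumbel-Softmax draw:", relaxed)
```

The Perturb-Argmax estimate is a probability vector over indices, while each Perturb-Softmax draw is itself a random point on the simplex. Lowering the temperature pushes those draws toward one-hot vectors.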
Taxonomy
Topics: Neural Networks and Applications · Advanced Statistical Methods and Models · Target Tracking and Data Fusion in Sensor Networks
