Fast Gumbel-Max Sketch and its Applications
Yuanming Zhang, Pinghui Wang, Yiyan Qi, Kuankuan Cheng, Junzhou Zhao, Guangjian Tian, Xiaohong Guan

TL;DR
This paper introduces FastGM, an efficient algorithm for generating multiple Gumbel-Max variables from high-dimensional data, significantly reducing computation time while maintaining accuracy.
Contribution
FastGM reduces the time complexity of generating k Gumbel-Max variables from O(kn) to O(k log k + n), where n is the number of positive-weight elements in the input vector, enabling faster sampling in large-scale applications.
Findings
FastGM is orders of magnitude faster than existing methods.
FastGM maintains accuracy without additional costs.
Experimental results validate the efficiency and effectiveness of FastGM.
Abstract
The well-known Gumbel-Max Trick for sampling elements from a categorical distribution (or more generally a non-negative vector) and its variants have been widely used in areas such as machine learning and information retrieval. To sample a random element i in proportion to its positive weight v_i, the Gumbel-Max Trick first computes a Gumbel random variable g_i for each positive-weight element i, and then samples the element i with the largest value of g_i + ln v_i. Recently, applications including similarity estimation and weighted cardinality estimation require generating k independent Gumbel-Max variables from high-dimensional vectors. However, this is computationally expensive for a large k (e.g., hundreds or even thousands) when using the traditional Gumbel-Max Trick. To solve this problem, we propose a novel algorithm, FastGM, which reduces the time complexity from…
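As a rough illustration of the traditional Gumbel-Max Trick the abstract describes (the baseline, not FastGM itself), the sketch below draws k independent samples by recomputing Gumbel noise for every positive-weight element each time, which costs O(kn) overall. The function names gumbel_max_sample and gumbel_max_sketch, the seed parameter, and the use of Python's standard random module are assumptions made for this example and do not come from the paper.

```python
import math
import random


def gumbel_max_sample(weights, rng):
    """Draw one index with probability proportional to its positive weight."""
    best_index, best_score = None, -math.inf
    for i, v in enumerate(weights):
        if v <= 0:
            continue  # non-positive weights are never selected
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
            u = rng.random()
        g = -math.log(-math.log(u))  # standard Gumbel noise
        score = g + math.log(v)      # perturbed log-weight
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def gumbel_max_sketch(weights, k, seed=0):
    """Naive baseline: k independent Gumbel-Max variables, O(k * n) time."""
    rng = random.Random(seed)  # hypothetical seeding choice for reproducibility
    return [gumbel_max_sample(weights, rng) for _ in range(k)]


if __name__ == "__main__":
    sketch = gumbel_max_sketch([1.0, 0.0, 3.0], k=8)
    print(sketch)  # a list of 8 indices, each either 0 or 2
```

Every one of the k draws touches all n positive elements, which is the cost the abstract calls computationally expensive for large k.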
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Advanced Graph Neural Networks
