Successive Halving Top-k Operator
Michał Pietruszka, Łukasz Borchmann, Filip Graliński

TL;DR
This paper introduces a differentiable successive halving method for the top-k operator, enabling gradient-based optimization with improved approximation and reduced computational cost through a tournament-style selection process.
Contribution
It presents a differentiable approximation of the top-k operator that replaces iterative softmax over the entire score vector with tournament-style selection, lowering computational cost.
Findings
Achieves a better top-k approximation at lower computational cost than the previous approach
Uses tournament-style selection for differentiability
Enables gradient-based optimization for top-k operations
Abstract
We propose a differentiable successive halving method of relaxing the top-k operator, rendering gradient-based optimization possible. The need to perform softmax iteratively on the entire vector of scores is avoided by using a tournament-style selection. As a result, a much better approximation of top-k with lower computational cost is achieved compared to the previous approach.
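The tournament idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it is one plausible reading under stated assumptions: the number of candidates is k times a power of two, candidates are re-sorted by score before each round, the strongest remaining candidate is paired with the weakest, and each pair is merged by a two-way softmax (a sigmoid of the score difference). The function name `soft_halving_topk` and the `temperature` parameter are hypothetical. NumPy has no automatic differentiation, so obtaining gradients would require porting the same operations to an autodiff framework.

```python
import numpy as np

def soft_halving_topk(scores, values, k, temperature=1.0):
    """Hypothetical sketch: halve the candidate set with soft pairwise
    'matches' until only k blended candidates remain.

    scores: shape (n,); values: shape (n, d). Assumes n == k * 2**r.
    """
    scores = np.asarray(scores, dtype=float)
    values = np.asarray(values, dtype=float)
    n = scores.shape[0]
    ratio = n // k if k > 0 else 0
    if k <= 0 or n % k != 0 or ratio & (ratio - 1) != 0:
        raise ValueError("this sketch assumes n == k * 2**r with k >= 1")

    while scores.shape[0] > k:
        # Assumption: sort descending so strong candidates meet weak ones.
        order = np.argsort(-scores)
        scores, values = scores[order], values[order]
        half = scores.shape[0] // 2
        top_s, bot_s = scores[:half], scores[half:][::-1]
        top_v, bot_v = values[:half], values[half:][::-1]

        # Softmax over a pair reduces to a sigmoid of the score difference.
        # After sorting, top_s >= bot_s, so the exponent is never positive.
        w = 1.0 / (1.0 + np.exp(-(top_s - bot_s) / temperature))

        # Each survivor is a convex combination of the two paired candidates.
        scores = w * top_s + (1.0 - w) * bot_s
        values = w[:, None] * top_v + (1.0 - w[:, None]) * bot_v
    return scores, values

# Usage: with a low temperature the result approaches hard top-k.
demo_scores, demo_values = soft_halving_topk(
    [0.0, 10.0, -10.0, 20.0], np.eye(4), k=2, temperature=0.1
)
```

In this sketch each round applies a softmax only within pairs rather than over the full score vector, which is the property the abstract highlights. Lower `temperature` values push the output toward a hard selection, while higher values blend paired candidates more evenly.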
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced Image Processing Techniques · Stochastic Gradient Optimization Techniques
Methods: Softmax
