Successive Halving Top-k Operator

Michał Pietruszka; Łukasz Borchmann; Filip Graliński

arXiv:2010.15552·cs.LG·October 30, 2020

TL;DR

This paper introduces a differentiable successive halving relaxation of the top-k operator. Tournament-style selection makes gradient-based optimization possible while giving a better approximation of top-k at lower computational cost than the previous approach.

Contribution

It presents a differentiable approximation of the top-k operator that avoids performing softmax iteratively on the entire vector of scores, which lowers the computational cost.

Findings

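01 Achieves a much better approximation of top-k at lower computational cost than the previous approach

02 Uses tournament-style selection to avoid performing softmax iteratively on the entire vector of scores

03 Relaxes the top-k operator so that gradient-based optimization becomes possible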

Abstract

We propose a differentiable successive halving method of relaxing the top-k operator, rendering gradient-based optimization possible. The need to perform softmax iteratively on the entire vector of scores is avoided by using a tournament-style selection. As a result, a much better approximation of top-k with lower computational cost is achieved compared to the previous approach.
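The tournament idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the function names `pairwise_soft_select` and `successive_halving_topk`, the `temperature` parameter, the strongest-with-weakest pairing rule, and the requirement that the number of candidates equals k times a power of two. Each round pairs candidates, applies a two-way softmax inside each pair, and keeps the convex combination, so the candidate set halves until k remain. The sketch uses plain Python and shows only the forward computation; an autograd framework would be needed to obtain gradients.

```python
import math


def pairwise_soft_select(score_a, score_b, vec_a, vec_b, temperature=1.0):
    """Softly pick the stronger of two candidates with a two-way softmax.

    Hypothetical helper: returns the convex combination of the two scores
    and of the two vectors, weighted by softmax over the pair of scores.
    """
    m = max(score_a, score_b)  # subtract the max for numerical stability
    ea = math.exp((score_a - m) / temperature)
    eb = math.exp((score_b - m) / temperature)
    wa = ea / (ea + eb)
    wb = 1.0 - wa
    score = wa * score_a + wb * score_b
    vec = [wa * x + wb * y for x, y in zip(vec_a, vec_b)]
    return score, vec


def successive_halving_topk(scores, vectors, k, temperature=1.0):
    """Halve the candidate set round by round until k soft winners remain.

    Assumption for this sketch: len(scores) == k * 2**r for some r >= 0.
    """
    n = len(scores)
    if k <= 0 or n % k != 0 or (n // k) & (n // k - 1) != 0:
        raise ValueError("this sketch needs n == k * 2**r")
    candidates = list(zip(scores, vectors))
    while len(candidates) > k:
        # Sort by score so that strong candidates meet weak ones
        # (the pairing rule is an assumption of this sketch).
        candidates.sort(key=lambda c: c[0], reverse=True)
        half = len(candidates) // 2
        candidates = [
            pairwise_soft_select(
                candidates[i][0], candidates[-1 - i][0],
                candidates[i][1], candidates[-1 - i][1],
                temperature,
            )
            for i in range(half)
        ]
    return candidates


if __name__ == "__main__":
    demo_scores = [5.0, 1.0, 4.0, 0.0]
    demo_vectors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    for score, vec in successive_halving_topk(demo_scores, demo_vectors, k=2):
        print(round(score, 3), [round(v, 3) for v in vec])
```

In this sketch each round applies softmax only to pairs, so the softmax work per round is proportional to the number of surviving candidates rather than repeating a softmax over the full score vector for every selected element. A lower `temperature` pushes each pairwise choice toward a hard selection.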

Code & Models

Repositories

applicaai/successive-halving-topk (PyTorch, official)

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Advanced Image Processing Techniques · Stochastic Gradient Optimization Techniques

Methods: Softmax