SoftSort: A Continuous Relaxation for the argsort Operator
Sebastian Prillo, Julian Martin Eisenschlos

TL;DR
This paper introduces SoftSort, a simple and fast continuous relaxation of the argsort operator. It enables end-to-end, gradient-based learning for models that rely on sorting, and the authors report state-of-the-art performance with simpler mathematical analysis.
Contribution
It proposes a new continuous relaxation for argsort that can be implemented in three lines of code, is faster than competing relaxations, and is easier to reason about mathematically.
Findings
Achieves state-of-the-art performance
Faster than competing approaches
Easy to implement in three lines of code
Abstract
While sorting is an important procedure in computer science, the argsort operator (which takes as input a vector and returns its sorting permutation) has a discrete image and thus zero gradients almost everywhere. This prohibits end-to-end, gradient-based learning of models that rely on the argsort operator. A natural way to overcome this problem is to replace the argsort operator with a continuous relaxation. Recent work has shown a number of ways to do this, but the relaxations proposed so far are computationally complex. In this work we propose a simple continuous relaxation for the argsort operator which has the following qualities: it can be implemented in three lines of code, achieves state-of-the-art performance, is easy to reason about mathematically (substantially simplifying proofs), and is faster than competing approaches. We open source the code to reproduce all of the…
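The abstract does not spell out the operator itself. As a rough sketch, assuming the definition usually given for SoftSort (a row-wise softmax over the negative absolute differences between the descending-sorted scores and the original scores, divided by a temperature), the relaxation might look like the following. The names soft_sort, hard_argsort, and tau are illustrative, and the code is plain Python rather than the authors' released implementation.

```python
import math

def soft_sort(s, tau=1.0):
    """Relaxed argsort (descending): returns an n x n row-stochastic matrix.

    Assumed form: softmax(-|sort(s)[i] - s[j]| / tau), applied row by row.
    """
    s_sorted = sorted(s, reverse=True)
    rows = []
    for target in s_sorted:
        # Logit j is high when s[j] is close to the i-th largest value.
        logits = [-abs(target - x) / tau for x in s]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

def hard_argsort(P):
    """Recover a discrete permutation by taking the argmax of each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in P]

scores = [0.3, 2.0, -1.0, 1.2]
P = soft_sort(scores, tau=0.1)
perm = hard_argsort(P)  # indices of scores from largest to smallest
```

Each row of P is a probability distribution over input positions. With a small tau the rows concentrate on a single index, so the matrix approaches the permutation matrix of the true argsort, while a larger tau gives smoother rows that carry gradient to more entries. In an autodiff framework the same computation reduces to a sort, a broadcasted absolute difference, and a softmax, which is consistent with the "three lines of code" claim.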
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications
