Generalized Neural Sorting Networks with Error-Free Differentiable Swap Functions

Jungtaek Kim; Jeongbeen Yoon; Minsu Cho

arXiv:2310.07174·cs.LG·March 15, 2024

Open Access · 1 Repo

TL;DR

This paper introduces an error-free differentiable swap function for neural sorting networks, enabling them to sort complex high-dimensional inputs such as multi-digit images and image fragments, and reports improved performance over baselines on several sorting benchmarks.

Contribution

We develop a novel error-free differentiable swap function and integrate it into a permutation-equivariant Transformer, enhancing neural sorting capabilities for complex data.
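What permutation equivariance means for an attention layer can be checked with a small sketch: permuting the input elements permutes the outputs in the same way. The version below is an illustration only, assuming single-head dot-product self-attention with identity projections and no positional encoding; the function names `self_attention` and `permute` are hypothetical and are not taken from the paper's code.

```python
import math


def self_attention(tokens):
    """Single-head dot-product self-attention over a list of vectors.

    Assumptions for illustration: queries, keys and values are the raw
    tokens (identity projections) and no positional encoding is added.
    """
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # Scaled dot-product score of this query against every key.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        # Softmax with max-subtraction for numerical stability.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * value[j] for w, value in zip(weights, tokens)) / total
            for j in range(dim)
        ])
    return outputs


def permute(rows, order):
    """Reorder rows so that position i holds rows[order[i]]."""
    return [rows[i] for i in order]


tokens = [[1.0, 0.0], [0.0, 2.0], [1.5, -0.5]]
order = [2, 0, 1]

# Equivariance: attention(permuted input) equals permuted attention(input).
lhs = self_attention(permute(tokens, order))
rhs = permute(self_attention(tokens), order)
max_gap = max(abs(a - b) for u, v in zip(lhs, rhs) for a, b in zip(u, v))
print("largest elementwise difference:", max_gap)
```

The two results agree up to floating-point rounding because each output is a weighted average over the whole set, so only the order of summation changes.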

Findings

01 Outperforms baseline methods on multiple sorting benchmarks

02 Effectively handles high-dimensional and complex inputs

03 Maintains differentiability and non-decreasing conditions

Abstract

Sorting is a fundamental operation of all computer systems, having been a long-standing significant research topic. Beyond the problem formulation of traditional sorting algorithms, we consider sorting problems for more abstract yet expressive inputs, e.g., multi-digit images and image fragments, through a neural sorting network. To learn a mapping from a high-dimensional input to an ordinal variable, the differentiability of sorting networks needs to be guaranteed. In this paper we define a softening error by a differentiable swap function, and develop an error-free swap function that holds a non-decreasing condition and differentiability. Furthermore, a permutation-equivariant Transformer network with multi-head attention is adopted to capture dependency between given inputs and also leverage its model capacity with self-attention. Experiments on diverse sorting benchmarks show that…
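The softening error mentioned in the abstract can be illustrated with a minimal sketch. It assumes a logistic-sigmoid relaxation of min/max and a straight-through-style correction (the relaxed value plus a hard-minus-soft term that would be detached from the gradient). The paper's exact formulation may differ. The names `soft_swap`, `softening_error`, `error_free_swap`, `sort_with_network` and the `steepness` parameter are hypothetical, and the odd-even transposition layout is chosen only for brevity.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_swap(x, y, steepness=1.0):
    """Relaxed (min, max) of two scalars as a sigmoid-weighted mix."""
    w = sigmoid(steepness * (y - x))  # approaches 1 when x is well below y
    soft_min = w * x + (1.0 - w) * y
    soft_max = (1.0 - w) * x + w * y
    return soft_min, soft_max


def softening_error(x, y, steepness=1.0):
    """Gap between the relaxed minimum and the exact minimum."""
    soft_min, _ = soft_swap(x, y, steepness)
    return soft_min - min(x, y)


def error_free_swap(x, y, steepness=1.0):
    """Forward values of a straight-through-style swap (assumed form).

    value = soft + stop_gradient(hard - soft). Plain Python has no
    autograd, so only the forward value is shown; in a framework such
    as PyTorch the bracketed correction would be detached.
    """
    soft_min, soft_max = soft_swap(x, y, steepness)
    hard_min, hard_max = min(x, y), max(x, y)
    return (soft_min + (hard_min - soft_min),
            soft_max + (hard_max - soft_max))


def sort_with_network(values, swap):
    """Odd-even transposition network: n rounds of neighbor swaps."""
    out = list(values)
    n = len(out)
    for round_idx in range(n):
        for i in range(round_idx % 2, n - 1, 2):
            out[i], out[i + 1] = swap(out[i], out[i + 1])
    return out


data = [3.0, 1.0, 2.0]
print("softening error for (1, 2):", softening_error(1.0, 2.0))
print("relaxed network output:", sort_with_network(data, soft_swap))
print("error-free network output:", sort_with_network(data, error_free_swap))
```

With `soft_swap`, neighboring values are blended and the error accumulates through the network; with `error_free_swap`, the forward pass returns the exact sorted values (up to floating-point rounding) while gradients would still follow the relaxed swap.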

Code & Models

Repositories

jungtaekkim/error-free-differentiable-swap-functions
pytorch

Taxonomy

Topics: Neural Networks and Applications · Face and Expression Recognition · Fuzzy Logic and Control Systems

Methods: Multi-Head Attention · Attention Is All You Need · Position-Wise Feed-Forward Layer · Softmax · Byte Pair Encoding · Linear Layer · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection