r-softmax: Generalized Softmax with Controllable Sparsity Rate
Klaudia Bałazy, Łukasz Struski, Marek Śmieja, Jacek Tabor

TL;DR
The paper introduces r-softmax, a novel softmax variant that produces sparse probability distributions with controllable sparsity, improving performance in multi-label classification and NLP tasks.
Contribution
It proposes a new softmax modification with an intuitive control mechanism for sparsity, outperforming existing sparse functions and enhancing transformer model fine-tuning.
Findings
r-softmax outperforms other sparse probability functions on multi-label datasets
Applying r-softmax to transformer models improves NLP task performance
r-softmax is highly competitive with the original softmax in various settings
Abstract
Nowadays, artificial neural network models achieve remarkable results in many disciplines. Functions that map the representation provided by the model to a probability distribution are an inseparable aspect of deep learning solutions. Although softmax is a commonly accepted probability mapping function in the machine learning community, it cannot return sparse outputs and always spreads positive probability across all positions. In this paper, we propose r-softmax, a modification of softmax that outputs a sparse probability distribution with a controllable sparsity rate. In contrast to existing sparse probability mapping functions, we provide an intuitive mechanism for controlling the output sparsity level. We show on several multi-label datasets that r-softmax outperforms other sparse alternatives to softmax and is highly competitive with the original softmax. We also apply…
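The contrast the abstract draws, between softmax assigning positive probability everywhere and a mapping whose fraction of zeros is set directly, can be illustrated with a small sketch. The code below is not the paper's definition of r-softmax, which this excerpt does not give. It is a simplified stand-in that zeroes the lowest-scoring fraction `r` of the inputs and renormalizes the rest. The function name `sparse_softmax_sketch` and the example logits are hypothetical.

```python
import math


def softmax(logits):
    """Standard softmax: every output is strictly positive."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_sketch(logits, r):
    """Illustrative only, NOT the paper's r-softmax.

    Zeroes the floor(r * n) smallest logits and renormalizes the
    remaining exponentials, so r acts as a target sparsity rate.
    """
    if not 0.0 <= r < 1.0:
        raise ValueError("r must satisfy 0 <= r < 1")
    n = len(logits)
    k = int(r * n)  # number of positions forced to zero
    order = sorted(range(n), key=lambda i: logits[i])  # ascending by logit
    dropped = set(order[:k])
    m = max(logits)
    exps = [0.0 if i in dropped else math.exp(logits[i] - m) for i in range(n)]
    total = sum(exps)  # at least 1.0, since the largest logit is always kept
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical model outputs
dense = softmax(logits)
sparse = sparse_softmax_sketch(logits, 0.5)
```

With `r = 0.5` and four inputs, the two smallest logits receive exactly zero probability, while `dense` keeps all four entries positive. Both outputs still sum to one.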
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Softmax
