Exploring Alternatives to Softmax Function
Kunal Banerjee, Vishak Prasad C, Rishi Raj Gupta, Karthik Vyas, Anushree H, Biswajit Mishra

TL;DR
This paper investigates various softmax alternatives, including Taylor softmax, SM-softmax, and a new SM-Taylor softmax, demonstrating that proper configurations can outperform the standard softmax in image classification tasks.
Contribution
The paper introduces SM-Taylor softmax, which combines Taylor softmax with soft-margin softmax (SM-softmax), and explores the effect of expanding the Taylor series in Taylor softmax up to ten terms, where the original work expanded only to two.
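A minimal NumPy sketch of how the three functions relate. It assumes the commonly used definitions, which this summary does not spell out: Taylor softmax replaces exp(z) with its truncated Taylor polynomial, SM-softmax subtracts a margin from the true-class logit before normalizing, and SM-Taylor softmax does both. The function names, the `margin` and `order` parameters, and the reading of `order` as the highest power kept in the polynomial are illustrative assumptions, not the authors' code.

```python
import math

import numpy as np


def taylor_poly(z, order=2):
    """Truncated Taylor expansion of exp(z): sum of z**i / i! for i = 0..order."""
    z = np.asarray(z, dtype=float)
    return sum(z**i / math.factorial(i) for i in range(order + 1))


def softmax(z):
    """Standard softmax, shifted by the max logit for numerical stability."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()


def taylor_softmax(z, order=2):
    """Softmax with exp replaced by its Taylor polynomial (assumed definition)."""
    if order % 2 != 0:
        # Even-order Taylor polynomials of exp are strictly positive,
        # so the normalized outputs form a valid probability distribution.
        raise ValueError("order must be even")
    f = taylor_poly(z, order)
    return f / f.sum()


def sm_softmax(z, target, margin=0.0):
    """Soft-margin softmax: subtract `margin` from the true-class logit (assumed definition)."""
    z = np.array(z, dtype=float)  # copy so the caller's logits are untouched
    z[target] -= margin
    return softmax(z)


def sm_taylor_softmax(z, target, margin=0.0, order=2):
    """SM-Taylor softmax: the margin shift followed by Taylor softmax (assumed definition)."""
    z = np.array(z, dtype=float)
    z[target] -= margin
    return taylor_softmax(z, order)


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]  # hypothetical logits for a 3-class problem
    print(softmax(logits))
    print(taylor_softmax(logits, order=2))
    print(sm_taylor_softmax(logits, target=0, margin=0.5, order=2))
```

With `margin=0` the soft-margin variants reduce to their plain counterparts. In this sketch the margin needs the true label, so it would only apply when computing a training loss.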
Findings
With a suitable configuration, SM-Taylor softmax outperforms standard softmax in the paper's image classification experiments.
Expanding the Taylor series up to ten terms enhances performance.
The optimal SM-Taylor softmax configuration varies between experiments.
Abstract
Softmax function is widely used in artificial neural networks for multiclass classification, multilabel classification, attention mechanisms, etc. However, its efficacy is often questioned in literature. The log-softmax loss has been shown to belong to a more generic class of loss functions, called spherical family, and its member log-Taylor softmax loss is arguably the best alternative in this class. In another approach which tries to enhance the discriminative nature of the softmax function, soft-margin softmax (SM-softmax) has been proposed to be the most suitable alternative. In this work, we investigate Taylor softmax, SM-softmax and our proposed SM-Taylor softmax, an amalgamation of the earlier two functions, as alternatives to softmax function. Furthermore, we explore the effect of expanding Taylor softmax up to ten terms (original work proposed expanding only to two terms) along…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
Methods: Softmax
