Swish-T : Enhancing Swish Activation with Tanh Bias for Improved Neural Network Performance

Youngmin Seo; Jinha Kim; Unsang Park

arXiv:2407.01012·cs.LG·April 6, 2026

TL;DR

Swish-T introduces a Tanh bias to the Swish activation, creating variants that improve neural network performance across diverse tasks and datasets, with empirical validation and publicly available code.

Contribution

The paper proposes Swish-T, a novel activation function family with Tanh bias, demonstrating improved performance and flexibility over the original Swish function.

Findings

01 Swish-T variants outperform Swish on multiple benchmarks.

02 Swish-T$_{C}$ still achieves high performance when used as a non-parametric function, as shown in the ablation study.

03 Empirical results on datasets such as MNIST, CIFAR-10, and CIFAR-100 validate the effectiveness of the Swish-T family.

Abstract

We propose the Swish-T family, an enhancement of the existing non-monotonic activation function Swish. Swish-T is defined by adding a Tanh bias to the original Swish function. This modification creates a family of Swish-T variants, each designed to excel in different tasks, showcasing specific advantages depending on the application context. The Tanh bias allows for broader acceptance of negative values during initial training stages, offering a smoother non-monotonic curve than the original Swish. We ultimately propose the Swish-T$_{C}$ function, while Swish-T and Swish-T$_{B}$, byproducts of Swish-T$_{C}$, also demonstrate satisfactory performance. Furthermore, our ablation study shows that using Swish-T$_{C}$ as a non-parametric function can still achieve high performance. The superiority of the Swish-T family has been empirically demonstrated…
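
To make "adding a Tanh bias to the original Swish function" concrete, here is a minimal sketch in plain Python. It assumes the base Swish-T takes the form x * sigmoid(beta * x) + alpha * tanh(x). That exact form, the parameter names `beta` and `alpha`, and the default values are assumptions for illustration and are not stated in the abstract. The Swish-T$_{B}$ and Swish-T$_{C}$ variants are not reproduced here because the abstract does not define them.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x: float, beta: float = 1.0) -> float:
    """Original Swish: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


def swish_t(x: float, beta: float = 1.0, alpha: float = 0.1) -> float:
    """Assumed Swish-T form: Swish plus a tanh bias scaled by alpha.

    beta and alpha are hypothetical names with illustrative defaults.
    """
    return swish(x, beta) + alpha * math.tanh(x)


if __name__ == "__main__":
    # Compare the two functions at a few points on either side of zero.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  swish={swish(x):+.4f}  swish_t={swish_t(x):+.4f}")
```

Under this assumed form, setting `alpha=0` recovers plain Swish, and for negative inputs the tanh term pushes the output further below zero, which is one way to read the abstract's "broader acceptance of negative values".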

Code & Models

Repositories

ictseoyoungmin/Swish-T-pytorch (GitHub)
