A Composite Activation Function for Learning Stable Binary Representations

Seokhun Park; Choeun Kim; Kwanho Lee; Sehyun Park; Insung Kong; Yongdai Kim

arXiv:2605.11558·cs.LG·May 13, 2026

TL;DR

This paper introduces the Heavy Tailed Activation Function (HTAF), a smooth approximation to the Heaviside function whose gradient properties support stable training of binary neural networks and interpretable models.

Contribution

The paper proposes HTAF, a novel composite activation function that enables stable gradient-based training of binary and spiking neural networks, and introduces ICBMs for interpretable discrete features.

Findings

01. HTAF maintains large gradient mass around zero inputs.

02. HTAF enables stable training of various binary neural network architectures.

03. ICBM achieves comparable or better performance with discrete features.

Abstract

Activation functions play a central role in neural networks by shaping internal representations. Recently, learning binary activation representations has attracted significant attention due to their advantages in computational and memory efficiency, as well as interpretability. However, training neural networks with Heaviside activations remains challenging, as their non-differentiability obstructs standard gradient-based optimization. In this paper, we propose Heavy Tailed Activation Function (HTAF), a smooth approximation to the Heaviside function that enables stable training with gradient-based optimization. We construct HTAF as a sigmoid hyperbolic tangent composite function and theoretically show that it maintains a large gradient mass around zero inputs while exhibiting slower gradient decay in the tail regions. We show that Spiking Neural Networks, Binary Neural Networks and Deep…
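
The training difficulty described in the abstract can be illustrated with a small sketch. The code below does not implement HTAF, whose exact formula is not given on this page. It only compares two generic smooth surrogates for the Heaviside step, both chosen here as assumptions for illustration: a sigmoid surrogate, whose gradient decays exponentially away from zero, and an arctan surrogate, whose gradient decays polynomially (a heavier tail). The function names and the sharpness parameter `beta` are hypothetical.

```python
import math


def heaviside(x: float) -> float:
    """Hard binary activation: its derivative is zero everywhere except at x = 0."""
    return 1.0 if x > 0 else 0.0


def sigmoid_step(x: float, beta: float = 4.0) -> float:
    """Sigmoid surrogate for the step (illustrative, not HTAF); beta sets sharpness."""
    z = beta * x
    # Numerically stable sigmoid that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_step_grad(x: float, beta: float = 4.0) -> float:
    """Derivative of sigmoid_step: decays exponentially as |x| grows."""
    s = sigmoid_step(x, beta)
    return beta * s * (1.0 - s)


def arctan_step(x: float, beta: float = 4.0) -> float:
    """Arctan surrogate for the step (illustrative, not HTAF)."""
    return 0.5 + math.atan(beta * x) / math.pi


def arctan_step_grad(x: float, beta: float = 4.0) -> float:
    """Derivative of arctan_step: decays like 1 / x**2, a heavier tail."""
    return beta / (math.pi * (1.0 + (beta * x) ** 2))


if __name__ == "__main__":
    for x in (0.0, 1.0, 5.0):
        print(f"x={x}: sigmoid grad={sigmoid_step_grad(x):.3e}, "
              f"arctan grad={arctan_step_grad(x):.3e}")
```

Both surrogates have a sizeable gradient at zero, but far from zero the sigmoid gradient is orders of magnitude smaller than the arctan gradient, so units with large pre-activations receive almost no learning signal. This is the kind of tail behavior the abstract refers to when it describes slower gradient decay in the tail regions.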
