A Quantitative Evaluation of Approximate Softmax Functions for Deep Neural Networks

Anthony Leiva-Valverde; Fabricio Elizondo-Fernández; Luis G. León-Vega; Cristina Meinhardt; Jorge Castro-Godínez

arXiv:2501.13379·cs.AR·April 9, 2026

TL;DR

This paper evaluates approximate computing methods for softmax functions in neural networks, focusing on FPGA implementation efficiency and accuracy trade-offs.

Contribution

It compares Taylor series and LUT-based interpolation techniques, demonstrating their effectiveness in resource-constrained FPGA environments.

Findings

01

Taylor approximations outperform LUT-based interpolation in execution time and resource efficiency.

02

Quadratic interpolation with LUTs has the lowest numerical error.

03

On real models, the approximations cost up to 0.2% in accuracy while saving 14% of resources.

Abstract

The softmax function is a widely used activation function in the output layers of neural networks, responsible for converting raw scores into class probabilities while introducing essential non-linearity. Implementing softmax efficiently poses challenges on low-end FPGAs due to limited hardware resources and the computational complexity of exponential and division operations. This work evaluates approximate computing techniques for softmax acceleration based on Taylor series and on interpolation with Look-Up Tables (LUTs). These approximations aim to reduce execution time and resource consumption while maintaining acceptable levels of numerical precision. Our findings show that quadratic interpolation with LUTs yields the lowest numerical error. In contrast, Taylor-based approximations offer significantly better performance in terms of execution time and resource efficiency due to…
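To make the two families of approximation concrete, here is a minimal software sketch of a softmax whose exponential is replaced by either a second-order Taylor polynomial or a quadratic interpolation over a look-up table. It is an illustration only, not the paper's FPGA design: the function names (`exp_taylor2`, `exp_lut_quadratic`), the table range of [-8, 0], and the 64 table intervals are all assumptions chosen for readability.

```python
import math

def exp_taylor2(x):
    # Second-order Taylor expansion of e^x around 0 (assumed order for this sketch).
    # Only accurate near 0; it is not even monotonic for x < -1.
    return 1.0 + x + 0.5 * x * x

# Hypothetical table layout: 64 uniform intervals covering [-8, 0].
X_MIN, X_MAX, N = -8.0, 0.0, 64
STEP = (X_MAX - X_MIN) / N
TABLE = [math.exp(X_MIN + i * STEP) for i in range(N + 1)]

def exp_lut_quadratic(x):
    # Clamp to the table range, then pick the nearest interior node.
    x = min(max(x, X_MIN), X_MAX)
    i = int(round((x - X_MIN) / STEP))
    i = min(max(i, 1), N - 1)  # keep nodes i-1 and i+1 inside the table
    t = (x - (X_MIN + i * STEP)) / STEP  # offset from node i, in units of STEP
    y0, y1, y2 = TABLE[i - 1], TABLE[i], TABLE[i + 1]
    # Parabola through three equally spaced nodes at t = -1, 0, 1.
    return y1 + 0.5 * t * (y2 - y0) + 0.5 * t * t * (y2 - 2.0 * y1 + y0)

def softmax(scores, exp_fn=math.exp):
    # Subtract the maximum so every exponent argument is <= 0.
    m = max(scores)
    exps = [exp_fn(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def max_abs_error(reference, approx):
    return max(abs(r - a) for r, a in zip(reference, approx))
```

Calling `softmax(scores, exp_taylor2)` and `softmax(scores, exp_lut_quadratic)` on the same scores and comparing each against `softmax(scores)` with `max_abs_error` shows the trade-off in miniature. The Taylor variant needs only a few multiply-adds but drifts as the shifted scores move away from 0, so this sketch is only meaningful when the scores lie within about 1 of their maximum. The LUT variant stays close to the exact value across its table range at the cost of storing the table.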
