TYTAN: Taylor-series based Non-Linear Activation Engine for Deep Learning Accelerators
Soham Pramanik, Vimal William, Arnab Raha, Debayan Das, Amitava Mukherjee, Janet L. Paluh

TL;DR
TYTAN introduces a hardware and algorithmic approach to accelerating non-linear activation functions in AI accelerators for edge AI inference, roughly doubling performance relative to NVDLA and reducing power consumption by about 56%.
Contribution
The paper presents TYTAN, a novel Taylor-series based non-linear activation engine that enhances energy efficiency and speed in AI accelerators through dynamic approximation and reconfigurable hardware.
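To make the idea of a Taylor-series activation concrete, here is a minimal software sketch. It is not the paper's hardware design. It approximates the exponential with a truncated Taylor series evaluated by Horner's rule and builds a sigmoid on top of it. The function names `taylor_exp` and `taylor_sigmoid` and the `terms` parameter are hypothetical, chosen for illustration only. The adjustable term count is just one plausible reading of "dynamic approximation".

```python
import math


def taylor_exp(x: float, terms: int = 8) -> float:
    """Approximate e**x using the first `terms` Taylor terms about 0.

    Horner's rule: 1 + x/1 * (1 + x/2 * (1 + x/3 * ...)), which needs
    only one multiply-add per term and no explicit factorials.
    """
    acc = 1.0
    for k in range(terms - 1, 0, -1):
        acc = 1.0 + x * acc / k
    return acc


def taylor_sigmoid(x: float, terms: int = 8) -> float:
    """Sigmoid built on the truncated series (assumed accurate only for small |x|)."""
    return 1.0 / (1.0 + taylor_exp(-x, terms))


if __name__ == "__main__":
    x = 0.5
    exact = 1.0 / (1.0 + math.exp(-x))
    for n in (2, 4, 8):
        approx = taylor_sigmoid(x, n)
        print(f"terms={n}: approx={approx:.8f} abs_err={abs(approx - exact):.2e}")
```

Fewer terms cost less computation but lose accuracy, and the truncated series degrades quickly as |x| grows, so a practical design would also need some form of input range handling, which this sketch omits.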
Findings
Operates at >950 MHz clock frequency.
Achieves ~2x performance improvement over NVDLA.
Reduces power consumption by ~56%.
Abstract
The rapid advancement in AI architectures and the proliferation of AI-enabled systems have intensified the need for domain-specific architectures that enhance both the acceleration and energy efficiency of AI inference, particularly at the edge. This need arises from the significant resource constraints, such as computational cost and energy consumption, associated with deploying AI algorithms, which involve intensive mathematical operations across multiple layers. High-power-consuming operations, including General Matrix Multiplications (GEMMs) and activation functions, can be optimized to address these challenges. Optimization strategies for AI at the edge include algorithmic approaches like quantization and pruning, as well as hardware methodologies such as domain-specific accelerators. This paper proposes TYTAN: TaYlor-series based non-linear acTivAtion eNgine, which explores the…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques
