TransAxx: Efficient Transformers with Approximate Computing

Dimitrios Danopoulos; Georgios Zervakis; Dimitrios Soudris; Jörg Henkel

arXiv:2402.07545 · cs.LG · May 8, 2025 · 1 citation

TL;DR

TransAxx introduces a framework for evaluating and optimizing Vision Transformer models with approximate computing techniques, enabling significant power savings while maintaining accuracy on low-power devices.

Contribution

The paper presents TransAxx, a novel PyTorch-based framework that integrates approximate arithmetic into ViT models and proposes a Monte Carlo Tree Search method for generating efficient approximate accelerators.
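
As a rough illustration of what integrating approximate arithmetic into a layer can look like, the sketch below emulates an approximate multiplier in software with a precomputed lookup table and uses it inside an integer dot product, the core operation of a linear or attention layer. Everything here is an assumption made for illustration: the truncating multiplier `approx_mul_u8`, the helpers `build_lut` and `approx_dot`, and the sample operands are hypothetical and are not taken from TransAxx or its API.

```python
from typing import List, Sequence


def approx_mul_u8(a: int, b: int, k: int = 2) -> int:
    """Hypothetical approximate multiplier for unsigned 8-bit operands.

    It zeroes the k least-significant bits of each operand before
    multiplying. This is an illustrative design, not one from the paper.
    """
    mask = ~((1 << k) - 1) & 0xFF
    return (a & mask) * (b & mask)


def build_lut(k: int = 2) -> List[List[int]]:
    """Precompute all 256 x 256 products so emulation is a table lookup."""
    return [[approx_mul_u8(a, b, k) for b in range(256)] for a in range(256)]


def approx_dot(xs: Sequence[int], ws: Sequence[int], lut: List[List[int]]) -> int:
    """Dot product in which every multiplication is replaced by a lookup."""
    return sum(lut[x][w] for x, w in zip(xs, ws))


LUT = build_lut(k=2)
xs = [200, 17, 93, 255]   # quantized activations (made-up values)
ws = [31, 128, 64, 7]     # quantized weights (made-up values)

exact = sum(x * w for x, w in zip(xs, ws))
approx = approx_dot(xs, ws, LUT)
print(exact, approx)      # prints "16113 14544"
```

A lookup table keeps the emulation independent of the multiplier's internal circuit: swapping in a different approximate design only requires regenerating the table.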

Findings

01. Approximate multipliers can be effectively used in ViT models.

02. Approximate-aware finetuning recovers accuracy lost due to approximation.

03. Significant power savings achieved with minimal accuracy loss.

Abstract

Vision Transformer (ViT) models, which were recently introduced by the transformer architecture, have proven to be very competitive and have become a popular alternative to Convolutional Neural Networks (CNNs). However, the high computational requirements of these models limit their practical applicability, especially on low-power devices. The current state of the art employs approximate multipliers to address the highly increased compute demands of DNN accelerators, but no prior research has explored their use on ViT models. In this work, we propose TransAxx, a framework based on the popular PyTorch library that enables fast inherent support for approximate arithmetic to seamlessly evaluate the impact of approximate computing on DNNs such as ViT models. Using TransAxx, we analyze the sensitivity of transformer models on the ImageNet dataset to approximate multiplications and perform…
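
Before measuring how sensitive a model is to approximate multiplications, it helps to characterize the multiplier itself. The sketch below exhaustively compares a hypothetical 8-bit truncating multiplier against exact multiplication and reports its mean and worst-case error. The multiplier, the function names `truncating_mul` and `error_stats`, and the choice of metrics are assumptions for illustration only; they are not described in the abstract.

```python
from typing import Tuple


def truncating_mul(a: int, b: int, k: int) -> int:
    """Hypothetical approximate multiplier: zero the k low bits of each
    unsigned 8-bit operand, then multiply exactly."""
    mask = ~((1 << k) - 1) & 0xFF
    return (a & mask) * (b & mask)


def error_stats(k: int) -> Tuple[float, int]:
    """Return (mean error, max error) over all 65,536 operand pairs.

    Error is exact minus approximate; it is never negative here because
    truncation can only shrink the operands.
    """
    total_err = 0
    max_err = 0
    for a in range(256):
        for b in range(256):
            err = a * b - truncating_mul(a, b, k)
            total_err += err
            max_err = max(max_err, err)
    return total_err / (256 * 256), max_err


# More truncated bits means a cheaper circuit in principle, but a larger error.
for k in range(4):
    mean_err, max_err = error_stats(k)
    print(f"k={k}: mean error {mean_err:.2f}, max error {max_err}")
```

With `k=0` the multiplier is exact, so both statistics are zero; larger `k` values trade arithmetic accuracy for a simpler multiplier, which is the kind of trade-off a sensitivity analysis has to weigh.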

Taxonomy

Topics: Advanced Memory and Neural Computing · Neural Networks and Reservoir Computing · Ferroelectric and Negative Capacitance Devices

Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Absolute Position Encodings · Softmax · Byte Pair Encoding · Linear Layer · Multi-Head Attention · Residual Connection