The Inhibitor: ReLU and Addition-Based Attention for Efficient Transformers under Fully Homomorphic Encryption on the Torus
Rickard Brännvall, Andrei Stoian

TL;DR
This paper introduces a ReLU and addition-based attention mechanism for quantized Transformers that reduces computational complexity, enabling efficient, privacy-preserving AI under homomorphic encryption without significantly sacrificing accuracy.
Contribution
The paper proposes a novel attention mechanism replacing dot-product and Softmax with addition and ReLU, improving efficiency and privacy-preserving capabilities in encrypted environments.
Findings
Comparable accuracy to traditional Transformers on benchmark tasks
Significant computational savings in plaintext and encrypted settings
Potential for enabling privacy-preserving AI applications
Abstract
To enhance the computational efficiency of quantized Transformers, we replace the dot-product and Softmax-based attention with an alternative mechanism involving addition and ReLU activation only. This side-steps the expansion to double precision often required by matrix multiplication and avoids costly Softmax evaluations but maintains much of the core functionality of conventional dot-product attention. It can enable more efficient execution and support larger quantized Transformer models on resource-constrained hardware or alternative arithmetic systems like homomorphic encryption. Training experiments on four common benchmark tasks show test set prediction scores comparable to those of conventional Transformers with dot-product attention. Our scaling experiments also suggest significant computational savings, both in plaintext and under encryption. In particular, we believe that the…
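The abstract names the ingredients (addition and ReLU in place of dot-product and Softmax) but does not give a formula here. The following is a minimal sketch of what such a mechanism could look like, under stated assumptions: scores are taken as a scaled Manhattan distance between queries and keys, and each value is "inhibited" by subtracting its score before a ReLU. The function name `inhibitor_attention`, the scale parameter `gamma`, and the exact form of both steps are illustrative assumptions, not details confirmed by the text above.

```python
def relu(x):
    """ReLU on a scalar."""
    return x if x > 0 else 0.0


def inhibitor_attention(Q, K, V, gamma=1.0):
    """Hypothetical addition-and-ReLU attention (illustrative sketch only).

    Q: list of n query vectors (length d)
    K: list of m key vectors (length d)
    V: list of m value vectors (length dv)
    gamma: assumed scale factor for the scores
    """
    out = []
    for q in Q:
        # Assumed score: scaled Manhattan distance between the query and each key.
        # |x| equals relu(x) + relu(-x), so no multiplication of inputs is needed.
        z = [sum(abs(qi - ki) for qi, ki in zip(q, k)) / gamma for k in K]
        # Assumed aggregation: subtract the score from each value component,
        # clip at zero with ReLU, then sum over keys.
        row = [
            sum(relu(v[c] - zj) for v, zj in zip(V, z))
            for c in range(len(V[0]))
        ]
        out.append(row)
    return out


if __name__ == "__main__":
    Q = [[0, 0]]
    K = [[0, 0], [1, 1]]
    V = [[2, 3], [4, 1]]
    print(inhibitor_attention(Q, K, V))
```

In this sketch, keys far from the query produce large scores that suppress their values, while nearby keys let values pass through mostly unchanged. No product of two input-dependent quantities appears, which is the property the abstract ties to avoiding wider intermediate precision in quantized or encrypted arithmetic.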
Taxonomy
Topics: Cryptography and Data Security · Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data
Methods: Multi-Head Attention · Dense Connections · Linear Layer · Label Smoothing · Absolute Position Encodings · Attention Is All You Need · Adam · Residual Connection · Layer Normalization · Softmax
