Element-Wise Attention Layers: an option for optimization
Giovanni Araujo Bacochina, Rodrigo Clemente Thom de Souza

TL;DR
This paper introduces element-wise attention layers, an adaptation of dot-product attention that replaces matrix multiplications with element-wise array multiplications. The goal is to cut parameter count substantially while keeping accuracy on image classification tasks close to that of a conventional baseline.
Contribution
It proposes a novel element-wise attention mechanism that simplifies dot-product attention, enabling more efficient models with fewer parameters without substantial accuracy loss.
Findings
Achieved 92% accuracy on Fashion MNIST with 97% fewer parameters.
Maintained 60% accuracy on CIFAR10 with 50% fewer parameters.
Compared the proposed method against a VGG-like model on two standard image classification datasets (Fashion MNIST and CIFAR10).
Abstract
Attention layers have become a trend since the popularization of Transformer-based models and are the key element of many state-of-the-art models developed in recent years. However, one of the biggest obstacles to implementing these architectures, as with many others in deep learning, is the enormous number of trainable parameters they possess, which makes their use conditional on the availability of robust hardware. This paper proposes a new attention mechanism that adapts Dot-Product Attention, which uses matrix multiplications, to become element-wise through the use of array multiplications. To test the effectiveness of this approach, two models (one with a VGG-like architecture and one with the proposed method) were trained on a classification task using the Fashion MNIST and CIFAR10 datasets. Each model has been…
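The contrast the abstract draws between matrix multiplication and element-wise array multiplication can be illustrated with a small NumPy sketch. The summary here does not give the paper's exact formulation, so `elementwise_attention` below is an assumption made for illustration only: it swaps each learned projection matrix for a learned weight vector and each matrix product for a broadcasted element-wise product. The function names, shapes, and the placement of the softmax are all hypothetical, and the parameter counts apply only to this toy setup, not to the paper's results.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dot_product_attention(x, w_q, w_k, w_v):
    """Standard scaled dot-product self-attention.

    x: (n, d) input; w_q, w_k, w_v: (d, d) projection matrices.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.sqrt(x.shape[-1])  # (n, n) pairwise scores
    return softmax(scores, axis=-1) @ v        # (n, d)


def elementwise_attention(x, w_q, w_k, w_v):
    """Hypothetical element-wise variant (an assumption, not the paper's definition).

    x: (n, d) input; w_q, w_k, w_v: (d,) weight vectors broadcast over rows.
    Every matrix product above becomes an element-wise product.
    """
    q, k, v = x * w_q, x * w_k, x * w_v
    scores = q * k                             # (n, d) per-element scores
    return softmax(scores, axis=-1) * v        # (n, d)


rng = np.random.default_rng(0)
n, d = 4, 8
x = rng.normal(size=(n, d))

dense_w = [rng.normal(size=(d, d)) for _ in range(3)]
vec_w = [rng.normal(size=(d,)) for _ in range(3)]

out_dense = dot_product_attention(x, *dense_w)
out_elem = elementwise_attention(x, *vec_w)

dense_params = sum(w.size for w in dense_w)  # 3 * d * d
elem_params = sum(w.size for w in vec_w)     # 3 * d
print(out_dense.shape, out_elem.shape)
print(dense_params, elem_params)
```

In this sketch the learned weights shrink from 3·d² to 3·d, which is where a large parameter saving could come from. The trade-off is that the element-wise scores no longer mix information across positions the way the (n, n) score matrix does.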
Taxonomy
Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Advanced Data and IoT Technologies
Methods: Test
