Neural Attention: Enhancing QKV Calculation in Self-Attention Mechanism with Neural Networks
Muhan Zhang

TL;DR
This paper introduces a neural network-based approach to computing the query, key, and value (QKV) projections in self-attention, and reports improved translation quality and language modeling performance over the traditional linear projections.
Contribution
It proposes a novel neural network structure for QKV calculation in self-attention, demonstrating significant performance gains in translation and language modeling tasks.
Findings
Enhanced BLEU scores in translation tasks
Reduced perplexity in language modeling
Validated effectiveness across multiple datasets
Abstract
In the realm of deep learning, the self-attention mechanism has substantiated its pivotal role across a myriad of tasks, encompassing natural language processing and computer vision. Despite achieving success across diverse applications, the traditional self-attention mechanism primarily leverages linear transformations for the computation of query, key, and value (QKV), which may not invariably be the optimal choice under specific circumstances. This paper probes into a novel methodology for QKV computation: implementing a specially designed neural network structure for the calculation. Utilizing a modified Marian model, we conducted experiments on the IWSLT 2017 German-English translation task dataset and juxtaposed our method with the conventional approach. The experimental results unveil a significant enhancement in BLEU scores with our method. Furthermore, our approach also…
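The contrast the abstract draws, a linear QKV projection versus a neural network in its place, can be illustrated with a minimal sketch. The paper's actual network structure is not given in this summary, so the two-layer form (linear, ReLU, linear), the function names, the dimensions, and the random weights below are all assumptions chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear_projection(x, w):
    """Conventional QKV projection: a single linear map."""
    return matmul(x, w)


def neural_projection(x, w1, w2):
    """Hypothetical replacement: linear -> ReLU -> linear.

    Assumption: this is NOT the paper's architecture, only one simple
    non-linear alternative to a single linear map.
    """
    hidden = [[max(0.0, v) for v in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention over one sequence."""
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
seq_len, d_model, d_hidden = 4, 8, 16  # illustrative sizes, not from the paper
x = random_matrix(seq_len, d_model, rng)

# Baseline: one weight matrix per projection.
w_q, w_k, w_v = (random_matrix(d_model, d_model, rng) for _ in range(3))
baseline_out, baseline_weights = attention(
    linear_projection(x, w_q), linear_projection(x, w_k), linear_projection(x, w_v)
)

# Variant: one small network (two weight matrices) per projection.
params = [
    (random_matrix(d_model, d_hidden, rng), random_matrix(d_hidden, d_model, rng))
    for _ in range(3)
]
q, k, v = (neural_projection(x, w1, w2) for w1, w2 in params)
neural_out, neural_weights = attention(q, k, v)

print(len(baseline_out), len(baseline_out[0]))
print(len(neural_out), len(neural_out[0]))
```

Only the projection step differs between the two paths; the attention computation itself is unchanged, so a variant of this kind can in principle be swapped into an existing model without touching the rest of the layer.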
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Stock Market Forecasting Methods
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Layer Normalization · Attention Dropout · Softmax · Adam · WordPiece
