Dynamic Self-Attention : Computing Attention over Words Dynamically for Sentence Embedding
Deunsol Yoon, Dongbok Lee, SangKeun Lee

TL;DR
This paper introduces Dynamic Self-Attention (DSA), an attention mechanism for sentence embedding that attends to informative words with a dynamic weight vector. It achieves state-of-the-art results among sentence encoding methods on SNLI with the fewest parameters, and competitive results on SST.
Contribution
The paper presents DSA, a self-attention method for sentence embedding built by modifying the dynamic routing of capsule networks, and shows that it matches or exceeds prior sentence encoders while using fewer parameters.
Findings
Achieved state-of-the-art results among sentence encoding methods on the SNLI dataset.
Demonstrated competitive performance on the SST dataset.
Used the fewest parameters among the sentence encoding methods compared on SNLI.
Abstract
In this paper, we propose Dynamic Self-Attention (DSA), a new self-attention mechanism for sentence embedding. We design DSA by modifying dynamic routing in capsule network (Sabour et al., 2017) for natural language processing. DSA attends to informative words with a dynamic weight vector. We achieve new state-of-the-art results among sentence encoding methods in Stanford Natural Language Inference (SNLI) dataset with the least number of parameters, while showing comparative results in Stanford Sentiment Treebank (SST) dataset.
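The abstract does not spell out the routing procedure, so the following is only a rough sketch of what attention with a dynamically updated weight vector could look like, not the authors' implementation. It assumes that attention logits start at zero, that weights are normalized with a softmax over words, that the weighted sum is squashed with tanh, and that each word's logit grows with its dot-product agreement with the current summary. The function name `dynamic_attention` and the parameter `n_iter` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_attention(words, n_iter=3):
    """Routing-style attention over word vectors (illustrative assumption).

    words: list of n word vectors, each a list of d floats.
    Returns (sentence_vector, attention_weights) from the last iteration.
    """
    n, d = len(words), len(words[0])
    logits = [0.0] * n  # zero logits give uniform attention on the first pass
    z, weights = [0.0] * d, [1.0 / n] * n
    for _ in range(n_iter):
        # Normalize the logits over words.
        weights = softmax(logits)
        # Weighted sum of word vectors, squashed elementwise with tanh.
        z = [
            math.tanh(sum(weights[i] * words[i][k] for i in range(n)))
            for k in range(d)
        ]
        # Raise each word's logit by its agreement with the summary vector.
        logits = [
            logits[i] + sum(words[i][k] * z[k] for k in range(d))
            for i in range(n)
        ]
    return z, weights


if __name__ == "__main__":
    # Two identical word vectors and one outlier (toy data, not from the paper).
    toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    sentence, attn = dynamic_attention(toy)
    print(sentence, attn)
```

In this toy setup the two matching vectors reinforce each other across iterations, so they end up with more weight than the outlier. A real encoder would first project word representations through learned layers, which this sketch omits.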
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
