Dynamic Self-Attention: Computing Attention over Words Dynamically for Sentence Embedding

Deunsol Yoon; Dongbok Lee; SangKeun Lee

arXiv:1808.07383·cs.LG·August 23, 2018·40 cites

Open Access · 1 Repo

TL;DR

This paper introduces Dynamic Self-Attention (DSA), an attention mechanism for sentence embedding that weights words dynamically. It reports state-of-the-art results among sentence encoding methods on SNLI with the fewest parameters, and comparable results on SST.

Contribution

The paper presents DSA, a self-attention method that adapts the dynamic routing of capsule networks to natural language processing, improving sentence embedding quality while using fewer parameters.

Findings

01

Achieved state-of-the-art results among sentence encoding methods on the SNLI dataset.

02

Demonstrated competitive performance on the SST dataset.

03

Used the fewest parameters among the sentence encoding methods compared on SNLI.

Abstract

In this paper, we propose Dynamic Self-Attention (DSA), a new self-attention mechanism for sentence embedding. We design DSA by modifying dynamic routing in capsule network (Sabour et al., 2017) for natural language processing. DSA attends to informative words with a dynamic weight vector. We achieve new state-of-the-art results among sentence encoding methods in Stanford Natural Language Inference (SNLI) dataset with the least number of parameters, while showing comparative results in Stanford Sentiment Treebank (SST) dataset.
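
The abstract does not spell out the update rule, so the following is only a rough sketch of routing-style attention over words, not the authors' implementation. It borrows the routing-by-agreement loop from capsule networks: per-word scores start at zero, a softmax turns them into weights, the weighted sum of word vectors is squashed, and each word's score grows by its dot product with that summary. The `tanh` squashing, the iteration count, and the names `softmax` and `dynamic_attention` are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_attention(words, iterations=3):
    """Pool word vectors into one sentence vector with routing-style weights.

    words: list of equal-length float lists, one vector per word.
    Returns (summary, weights), where weights are the attention weights
    that produced the returned summary.
    Hypothetical helper: a sketch under the assumptions stated above.
    """
    dim = len(words[0])
    logits = [0.0] * len(words)  # one dynamic score per word, initially equal
    summary, weights = [], []
    for _ in range(iterations):
        # Turn the current scores into attention weights over words.
        weights = softmax(logits)
        # Weighted sum of word vectors, squashed element-wise (tanh is assumed).
        summary = [
            math.tanh(sum(w * vec[d] for w, vec in zip(weights, words)))
            for d in range(dim)
        ]
        # Agreement update: words aligned with the summary gain score.
        logits = [
            q + sum(v * z for v, z in zip(vec, summary))
            for q, vec in zip(logits, words)
        ]
    return summary, weights
```

With three toy vectors such as `[[1.0, 0.0], [1.0, 0.0], [-0.2, 0.1]]`, the two vectors that agree with the pooled summary receive larger weights than the outlier once more than one iteration has run, which is the sense in which the weight vector is "dynamic".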

Code & Models

Repositories

dsindex/iclassifier
pytorch

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks