An Introductory Survey on Attention Mechanisms in NLP Problems

Dichao Hu

arXiv:1811.05544·cs.CL·November 15, 2018·40 cites

Open Access

TL;DR

This paper provides an introductory survey of attention mechanisms in NLP, covering their variants, applications, evaluation methods, and relationship with other machine learning techniques, to help readers understand this widely used method.

Contribution

It offers a comprehensive overview of attention mechanisms in NLP, summarizing recent research, variants, and evaluation approaches for the first time in a single survey.

Findings

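01. Attention mechanisms have brought significant improvements to a range of NLP tasks, including sentiment classification, text summarization, question answering, and dependency parsing.

02. Different variants of attention have been developed for different NLP problems.

03. The survey examines methods for evaluating the performance of attention.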

Abstract

First derived from human intuition and later adapted to machine translation for automatic token alignment, the attention mechanism is a simple method for encoding sequence data based on the importance score assigned to each element. It has been widely applied to, and has brought significant improvements in, various natural language processing tasks, including sentiment classification, text summarization, question answering, and dependency parsing. In this paper, we survey recent work and give an introductory summary of the attention mechanism in different NLP problems. We aim to provide readers with basic knowledge of this widely used method, discuss its variants for different tasks, explore its association with other machine learning techniques, and examine methods for evaluating its performance.
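The core idea in the abstract, encoding a sequence according to an importance score assigned to each element, can be illustrated with a minimal sketch. This is not code from the paper: the function names `softmax` and `attend`, the dot-product scoring, and the toy vectors are all assumptions chosen for illustration, and the survey itself covers several other scoring functions and variants.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, elements):
    """Encode a sequence as a weighted average of its elements.

    Assumption for this sketch: importance is the dot product
    between the query vector and each element vector.
    """
    scores = [sum(q * x for q, x in zip(query, el)) for el in elements]
    weights = softmax(scores)
    dim = len(elements[0])
    # Weighted sum of the element vectors, one dimension at a time.
    context = [sum(w * el[i] for w, el in zip(weights, elements))
               for i in range(dim)]
    return weights, context

# Hypothetical toy input: three 2-dimensional token vectors and one query.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = attend(query, tokens)
```

In this toy input the third token has the largest dot product with the query, so it receives the largest weight and contributes most to the encoded vector `context`.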

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques