Hierarchical Attention: What Really Counts in Various NLP Tasks

Zehao Dou; Zhihua Zhang

arXiv:1808.03728·cs.CL·August 14, 2018

TL;DR

This paper introduces a Hierarchical Attention Mechanism (Ham) that takes a weighted sum over the layers of a multi-level attention, and reports gains over existing attention models on Chinese poem generation and machine reading comprehension.

Contribution

The paper proposes a novel hierarchical attention mechanism that integrates multi-level attention layers, enhancing NLP model performance and generalization capabilities.

Findings

01

Achieved a BLEU score of 0.26 in Chinese poem generation.

02

Improved machine reading comprehension results by nearly 6.5% on average over existing models such as BIDAF and Match-LSTM.

03

Demonstrated greater generalization and representation ability than existing attention methods.

Abstract

Attention mechanisms in sequence-to-sequence models have shown strong performance in various natural language processing (NLP) tasks, such as sentence embedding, text generation, machine translation, and machine reading comprehension. Unfortunately, existing attention mechanisms learn only either high-level or low-level features. In this paper, we argue that the lack of hierarchical mechanisms is a bottleneck in improving the performance of attention mechanisms, and propose a novel Hierarchical Attention Mechanism (Ham) based on the weighted sum of different layers of a multi-level attention. Ham achieves a state-of-the-art BLEU score of 0.26 on the Chinese poem generation task and an average improvement of nearly 6.5% over existing machine reading comprehension models such as BIDAF and Match-LSTM. Furthermore, our experiments and theorems reveal that Ham…
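
The abstract describes Ham as a weighted sum of different layers of a multi-level attention. The following is a minimal sketch of that idea, not the authors' implementation. It makes several assumptions that the abstract does not state: each layer is plain dot-product attention, each layer uses the previous layer's output as its query, keys and values share one dimension, and the per-layer weights are normalised with a softmax. The names `attend`, `hierarchical_attention`, and `layer_logits` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """One dot-product attention layer (assumed form, not taken from the paper)."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    probs = softmax(scores)
    dim = len(values[0])
    # Output is the attention-weighted average of the value vectors.
    return [sum(p * v[i] for p, v in zip(probs, values)) for i in range(dim)]


def hierarchical_attention(query, keys, values, layer_logits):
    """Weighted sum of the outputs of len(layer_logits) stacked attention layers."""
    # Assumption: layer weights are kept positive and summing to 1 via softmax.
    layer_weights = softmax(layer_logits)
    outputs = []
    current = query
    for _ in layer_logits:
        # Assumption: each layer re-queries the same keys/values with the
        # previous layer's output, so values must match the key dimension.
        current = attend(current, keys, values)
        outputs.append(current)
    dim = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(layer_weights, outputs))
            for i in range(dim)]


# Usage: three layers with equal weights over two toy key/value vectors.
keys = [[1.0, 0.0], [0.0, 1.0]]
result = hierarchical_attention([1.0, 0.0], keys, keys, [0.0, 0.0, 0.0])
```

In a trained model the entries of `layer_logits` would be learned, letting the model decide how much each attention depth contributes; with a single layer the sketch reduces to ordinary attention.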

Code & Models

Repositories

Disiok/poetry-seq2seq · tf · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications