# Diversity driven Attention Model for Query-based Abstractive Summarization

**Authors:** Preksha Nema, Mitesh Khapra, Anirban Laha, Balaraman Ravindran

arXiv: 1704.08300 · 2018-07-16

## TL;DR

This paper introduces a diversity-driven attention model for query-based abstractive summarization. It combines query-specific attention with a diversity mechanism that reduces repeated phrases, and reports a 28% absolute gain in ROUGE-L over vanilla encode-attend-decode models.

## Contribution

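The paper proposes an encode-attend-decode model for query-based summarization with two additions: a query attention model that focuses on different portions of the query at each time step, and a diversity-based attention model that discourages repeated phrases. It also introduces a new query-based summarization dataset built on Debatepedia for evaluation.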
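The two attention additions can be illustrated with a minimal sketch. It is not the authors' formulation: the dot-product scoring, the way the query context is folded into the document attention key, and the orthogonalization step used to encourage diversity are all simplifying assumptions made here for illustration, and every function name (`attend`, `diversify`, `decode_step`) is hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(states, key):
    """Dot-product attention (an assumption): weight each state by its similarity to `key`."""
    weights = softmax([dot(h, key) for h in states])
    dim = len(states[0])
    context = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return context, weights


def diversify(context, prev_diverse, eps=1e-12):
    """Remove the component of `context` that lies along the previous diverse context.

    This orthogonalization is one simple way to discourage attending to the
    same content twice; it is an illustrative assumption, not a confirmed detail.
    """
    if prev_diverse is None:
        return list(context)
    denom = dot(prev_diverse, prev_diverse)
    if denom < eps:
        return list(context)
    scale = dot(context, prev_diverse) / denom
    return [c - scale * p for c, p in zip(context, prev_diverse)]


def decode_step(doc_states, query_states, decoder_state, prev_diverse=None):
    """One decoder time step: query attention, then document attention, then diversity."""
    # Query attention: recomputed at every step instead of using a static query vector.
    query_ctx, _ = attend(query_states, decoder_state)
    # Document attention conditioned on both the decoder state and the query context.
    key = [s + q for s, q in zip(decoder_state, query_ctx)]
    doc_ctx, _ = attend(doc_states, key)
    return diversify(doc_ctx, prev_diverse)
```

In this sketch, feeding the result of one `decode_step` call back in as `prev_diverse` for the next call yields a context vector orthogonal to the previous one, which is the intuition behind using diversity to avoid repetition.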

## Key findings

- Achieved a 28% absolute gain in ROUGE-L over vanilla encode-attend-decode models.
- Demonstrated the effectiveness of diversity-based attention in reducing repeated phrases.
- Validated the model on a newly created dataset built on Debatepedia.
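For readers unfamiliar with the metric behind the headline number, ROUGE-L scores a generated summary by its longest common subsequence (LCS) with a reference summary. The sketch below is a simplified sentence-level version, assuming whitespace tokenization and a balanced F-measure (`beta=1.0`); the paper's exact evaluation settings are not stated in this summary, and the function names are hypothetical. An "absolute" gain is a difference between two such scores, not a ratio.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """Simplified ROUGE-L F-measure over whitespace tokens (an illustrative assumption)."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of the candidate that matches
    recall = lcs / len(ref)      # share of the reference that is covered
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)
```

Under this definition, a summary padded with repeated phrases grows longer without extending the LCS, so its precision and therefore its score drop.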

## Abstract

Abstractive summarization aims to generate a shorter version of the document covering all the salient points in a compact and coherent fashion. On the other hand, query-based summarization highlights those points that are relevant in the context of a given query. The encode-attend-decode paradigm has achieved notable success in machine translation, extractive summarization, dialog systems, etc. But it suffers from the drawback of generation of repeated phrases. In this work we propose a model for the query-based summarization task based on the encode-attend-decode paradigm with two key additions (i) a query attention model (in addition to document attention model) which learns to focus on different portions of the query at different time steps (instead of using a static representation for the query) and (ii) a new diversity based attention model which aims to alleviate the problem of repeating phrases in the summary. In order to enable the testing of this model we introduce a new query-based summarization dataset building on debatepedia. Our experiments show that with these two additions the proposed model clearly outperforms vanilla encode-attend-decode models with a gain of 28% (absolute) in ROUGE-L scores.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.08300/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1704.08300/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1704.08300/full.md

---
Source: https://tomesphere.com/paper/1704.08300