An Exploratory Study on Long Dialogue Summarization: What Works and What's Next

Yusen Zhang; Ansong Ni; Tao Yu; Rui Zhang; Chenguang Zhu; Budhaditya Deb; Asli Celikyilmaz; Ahmed Hassan Awadallah; Dragomir Radev

arXiv:2109.04609·cs.CL·September 13, 2021

TL;DR

This study explores methods for long dialogue summarization, comparing extended transformers, retrieval-based, and hierarchical models, and finds retrieve-then-summarize approaches perform best on multiple datasets.

Contribution

It provides a comprehensive comparison of three strategies for long dialogue summarization and highlights the effectiveness of retrieve-then-summarize pipelines with improved retrieval and pretraining.

Findings

1. Retrieve-then-summarize models outperform other strategies.
2. Stronger retrieval models improve summary quality.
3. Pretraining on external datasets enhances performance.

Abstract

Dialogue summarization helps readers capture salient information from long conversations in meetings, interviews, and TV series. However, real-world dialogues pose a great challenge to current summarization models, as the dialogue length typically exceeds the input limits imposed by recent transformer-based pre-trained models, and the interactive nature of dialogues makes relevant information more context-dependent and sparsely distributed than news articles. In this work, we perform a comprehensive study on long dialogue summarization by investigating three strategies to deal with the lengthy input problem and locate relevant information: (1) extended transformer models such as Longformer, (2) retrieve-then-summarize pipeline models with several dialogue utterance retrieval methods, and (3) hierarchical dialogue encoding models such as HMNet. Our experimental results on three long…
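The retrieve-then-summarize strategy named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the word-overlap scorer stands in for the utterance retrieval methods the paper compares, `echo_summarizer` is a hypothetical placeholder for a pre-trained summarization model, and the function names, the sample dialogue, and the token budget are all assumptions made for the example.

```python
import string


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(string.punctuation).lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def retrieve(utterances, query, budget):
    """Greedily keep the utterances that share the most words with the query,
    as long as the running token count stays within `budget`.
    The selected utterances are returned in their original dialogue order."""
    query_terms = set(tokenize(query))
    scored = []
    for idx, utt in enumerate(utterances):
        toks = tokenize(utt)
        overlap = sum(1 for tok in toks if tok in query_terms)
        scored.append((overlap, idx, len(toks)))
    # Highest overlap first; ties are broken by position in the dialogue.
    scored.sort(key=lambda item: (-item[0], item[1]))
    chosen, used = [], 0
    for _overlap, idx, length in scored:
        if used + length <= budget:
            chosen.append(idx)
            used += length
    return [utterances[i] for i in sorted(chosen)]


def retrieve_then_summarize(utterances, query, budget, summarizer):
    """Shrink the dialogue to fit the input limit, then summarize it."""
    return summarizer(" ".join(retrieve(utterances, query, budget)))


def echo_summarizer(text):
    """Hypothetical stand-in for a real summarizer: returns its input unchanged."""
    return text


# Illustrative data only; not taken from the paper's datasets.
dialogue = [
    "A: Good morning everyone.",
    "B: The budget for the remote control is twelve euros.",
    "C: I had a nice weekend.",
    "A: We should cut the budget on the remote case.",
]
query = "remote control budget"
selected = retrieve(dialogue, query, budget=20)
```

With a budget of 20 tokens, only the two utterances that mention the query terms fit, so the summarizer never sees the small talk. A stronger scorer or a trained retriever could replace the overlap count without changing the pipeline's overall shape.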

Code & Models

Repositories

chatc/longdialsumm (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Layer Normalization · Linear Warmup With Linear Decay · Softmax · AdamW