EmailSum: Abstractive Email Thread Summarization

Shiyue Zhang; Asli Celikyilmaz; Jianfeng Gao; Mohit Bansal

arXiv:2107.14691 · cs.CL · August 2, 2021 · 1 citation

TL;DR

This paper introduces EmailSum, a dataset for abstractive email thread summarization, and reports extensive experiments that expose the weaknesses of current models and show that automatic evaluation metrics need to be backed by human judgment.

Contribution

The paper presents a human-annotated dataset of short (<30 words) and long (<100 words) summaries for 2549 email threads, together with a comprehensive empirical study of summarization techniques and evaluation methods.

Findings

01

Current models struggle with understanding sender intent and roles.

02

Automatic metrics like ROUGE and BERTScore are weakly correlated with human judgments.

03

Human evaluation is crucial for assessing summary quality.
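
To make the second finding concrete, here is a minimal sketch of how one might check whether an automatic metric tracks human ratings. It is an illustration only: `rouge1_f1` is a simplified, whitespace-tokenized unigram-overlap F1 rather than the official ROUGE package, `pearson` is a plain Pearson correlation, and the example summaries and human scores are hypothetical, not data from the paper.

```python
from collections import Counter
from math import sqrt


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on lowercased whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def pearson(xs, ys) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical (system summary, reference summary, human rating) triples.
    examples = [
        ("alice asks bob to move the meeting", "alice asks bob to reschedule the meeting", 4.0),
        ("the meeting is discussed", "alice asks bob to reschedule the meeting", 2.0),
        ("bob declines and proposes friday", "bob cannot attend and suggests friday", 4.5),
    ]
    metric_scores = [rouge1_f1(cand, ref) for cand, ref, _ in examples]
    human_scores = [rating for _, _, rating in examples]
    print("ROUGE-1 F1 per example:", metric_scores)
    print("Pearson r with human ratings:", pearson(metric_scores, human_scores))
```

A low correlation on real annotations would mean the metric's ranking of summaries disagrees with the annotators' ranking, which is the situation the finding describes.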

Abstract

Recent years have brought about an interest in the challenging task of summarizing conversation threads (meetings, online discussions, etc.). Such summaries help analysis of the long text to quickly catch up with the decisions made and thus improve our work or communication efficiency. To spur research in thread summarization, we have developed an abstractive Email Thread Summarization (EmailSum) dataset, which contains human-annotated short (<30 words) and long (<100 words) summaries of 2549 email threads (each containing 3 to 10 emails) over a wide variety of topics. We perform a comprehensive empirical study to explore different summarization techniques (including extractive and abstractive methods, single-document and hierarchical models, as well as transfer and semisupervised learning) and conduct human evaluations on both short and long summary generation tasks. Our results reveal…

Code & Models

Repositories

ZhangShiyue/EmailSum
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques

Methods: Linear Layer · Byte Pair Encoding · Gated Linear Unit · Inverse Square Root Schedule · Adafactor · Dense Connections · Softmax · Attention Dropout · Dropout