Towards Understanding Omission in Dialogue Summarization

Yicheng Zou; Kaitao Song; Xu Tan; Zhongkai Fu; Qi Zhang; Dongsheng Li; Tao Gui

arXiv:2211.07145 · cs.CL · May 12, 2023

TL;DR

This paper introduces the OLDS dataset with omission labels for dialogue summarization, highlighting the importance of detecting omitted information to improve summarization quality and providing tools for further research.

Contribution

The paper presents a new dataset with omission labels and demonstrates the significance of omission detection in enhancing dialogue summarization.

Findings

01

Providing ground-truth omission labels improves summarization quality.

02

The OLDS dataset supports training and evaluation of omission detection.

03

Omission detection is crucial for reducing information loss in dialogue summaries.

Abstract

Dialogue summarization aims to condense a lengthy dialogue into a concise summary, and has recently achieved significant progress. However, the results of existing methods are still far from satisfactory. Previous works indicated that omission is a major factor affecting the quality of summarization, but few of them have further explored the omission problem, such as how omission affects summarization results and how to detect omission, which is critical for reducing omission and improving summarization quality. Moreover, analyzing and detecting omission relies on summarization datasets with omission labels (i.e., which dialogue utterances are omitted in the summarization), which are not available in the current literature. In this paper, we propose the OLDS dataset, which provides high-quality Omission Labels for Dialogue Summarization. By analyzing this dataset, we find that a…
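
To make the idea of utterance-level omission labels concrete, the following is a minimal Python sketch of how one labeled example might be represented and how a detector's predictions could be scored against it. The class name, field names, and the toy dialogue are all hypothetical and invented for illustration; they are not the OLDS schema, and the set-based precision/recall/F1 scoring is an assumption, not necessarily the paper's evaluation protocol.

```python
from dataclasses import dataclass


@dataclass
class OmissionExample:
    """One dialogue, one candidate summary, and its omission labels.

    Hypothetical layout for illustration only; not the OLDS file format.
    """
    utterances: list[str]
    candidate_summary: str
    omitted: set[int]  # indices of utterances whose key content the summary misses


def prf(gold: set[int], predicted: set[int]) -> tuple[float, float, float]:
    """Set-based precision, recall, and F1 over utterance indices."""
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (invented data): the summary drops the meeting time in utterance 2.
example = OmissionExample(
    utterances=[
        "A: Are we still meeting tomorrow?",
        "B: Yes, at the cafe.",
        "A: Great, let's say 3 pm.",
    ],
    candidate_summary="A and B will meet at the cafe tomorrow.",
    omitted={2},
)

# A hypothetical detector that flags utterances 1 and 2 as omitted.
precision, recall, f1 = prf(example.omitted, {1, 2})
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Treating labels as a set of utterance indices keeps the scoring independent of how a detector is built: any model that outputs indices can be compared against the gold labels this way.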

Code & Models

Repositories

microsoft/msummarizer (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems