CNewSum: A Large-scale Chinese News Summarization Dataset with Human-annotated Adequacy and Deducibility Level
Danqing Wang, Jiaze Chen, Xianze Wu, Hao Zhou, Lei Li

TL;DR
CNewSum is a large-scale Chinese news summarization dataset with human annotations on adequacy and deducibility, facilitating research in document understanding and generation for Chinese text summarization.
Contribution
The paper introduces CNewSum, a large-scale Chinese news summarization dataset with novel annotations on adequacy and deducibility, addressing the lack of resources in Chinese summarization research.
Findings
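Recent summarization methods are evaluated on CNewSum.
The adequacy and deducibility annotations on the test set enable finer-grained analysis of model performance.
The dataset is intended to support future research on Chinese summarization.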
Abstract
Automatic text summarization aims to produce a brief but crucial summary for the input documents. Both extractive and abstractive methods have witnessed great success in English datasets in recent years. However, there has been a minimal exploration of text summarization in Chinese, limited by the lack of large-scale datasets. In this paper, we present a large-scale Chinese news summarization dataset CNewSum, which consists of 304,307 documents and human-written summaries for the news feed. It has long documents with high-abstractive summaries, which can encourage document-level understanding and generation for current summarization models. An additional distinguishing feature of CNewSum is that its test set contains adequacy and deducibility annotations for the summaries. The adequacy level measures the degree of summary information covered by the document, and the deducibility…
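As a rough illustration of how test-set annotations like these could be used to break down model performance, the sketch below groups examples by an annotation label and averages a per-example score within each group. The field names `adequacy`, `deducibility`, and `score`, the use of 0/1 labels, and the toy values are all assumptions made for this example; they are not taken from the dataset's actual schema or from any reported results.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_label(records, label_key, score_key="score"):
    """Group records by an annotation label and average a per-example score.

    `records` is a list of dicts. The key names are hypothetical and would
    need to be mapped to whatever fields the real data files provide.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[label_key]].append(rec[score_key])
    return {label: mean(scores) for label, scores in groups.items()}


# Toy records: labels and scores are made up for illustration only.
records = [
    {"adequacy": 1, "deducibility": 1, "score": 0.40},
    {"adequacy": 1, "deducibility": 0, "score": 0.30},
    {"adequacy": 0, "deducibility": 0, "score": 0.20},
    {"adequacy": 1, "deducibility": 1, "score": 0.50},
]

by_adequacy = mean_score_by_label(records, "adequacy")
by_deducibility = mean_score_by_label(records, "deducibility")
```

In practice the score would come from whatever summarization metric is being reported, and comparing the group means shows whether a model does worse on summaries carrying a particular annotation.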
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: Test
