CERD: A Comprehensive Chinese Rhetoric Dataset for Rhetorical Understanding and Generation in Essays
Nuowei Liu, Xinhao Chen, Hongyi Wu, Changzhi Sun, Man Lan, Yuanbin Wu, Xiaopeng Bai, Shaoguang Mao, Yan Xia

TL;DR
This paper introduces CERD, a comprehensive Chinese rhetoric dataset with multiple interrelated tasks, enabling better understanding and generation of rhetorical devices in essays, and establishes benchmarks for future research.
Contribution
The paper presents CERD, a novel, manually annotated Chinese rhetoric dataset with interrelated sub-tasks, facilitating multi-faceted rhetorical understanding and generation in essays.
Findings
Large Language Models perform best across tasks
Joint fine-tuning improves model performance
CERD enables understanding and generation of rhetorical devices
Abstract
Existing rhetorical understanding and generation datasets or corpora primarily focus on single coarse-grained categories or fine-grained categories, neglecting the common interrelations between different rhetorical devices by treating them as independent sub-tasks. In this paper, we propose the Chinese Essay Rhetoric Dataset (CERD), consisting of 4 commonly used coarse-grained categories (metaphor, personification, hyperbole, and parallelism) and 23 fine-grained categories across both form and content levels. CERD is a manually annotated and comprehensive Chinese rhetoric dataset with five interrelated sub-tasks. Unlike previous work, our dataset aids in understanding various rhetorical devices, recognizing corresponding rhetorical components, and generating rhetorical sentences under given conditions, thereby improving the author's writing proficiency and language usage skills.…
Taxonomy
Topics: Computational and Text Analysis Methods · Discourse Analysis in Language Studies · Topic Modeling
