Character Distributions of Classical Chinese Literary Texts: Zipf's Law, Genres, and Epochs
Chao-Lin Liu, Shuhua Zhang, Yuanli Geng, Huei-ling Lai, Hongsu Wang

TL;DR
This study analyzes character and word distributions in 14 Chinese corpora spanning roughly 3,000 years (1046 BCE to 2007 CE), examining how genre and epoch affect the deviations and similarities among their Zipfian curves.
Contribution
The paper compares character distributions across 14 corpora covering multiple genres and epochs, and looks for the factors behind the deviations and similarities among their Zipfian curves.
Findings
Poetic works from 618 to 1644 CE show similar character distribution patterns.
Texts from the same dynasty may tend to use the same set of characters, yet their character distributions still differ.
Both genre and epoch influence character distribution patterns.
Abstract
In this study, we collect 14 representative corpora covering major periods in Chinese history. These corpora include poetic works produced in several dynasties, novels of the Ming and Qing dynasties, and essays and news reports written in modern Chinese. The corpora span the period from 1046 BCE to 2007 CE. We analyze their character and word distributions from the viewpoint of Zipf's law, and look for factors that affect the deviations and similarities between their Zipfian curves. Both genres and epochs showed an influence in our analyses. Specifically, the character distributions for poetic works written between 618 CE and 1644 CE exhibit striking similarity. In addition, although texts of the same dynasty may tend to use the same set of characters, their character distributions still deviate from each other.
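As a rough illustration of what a Zipfian analysis of characters involves: Zipf's law says that an item's frequency is approximately proportional to 1/rank^s, so a plot of log frequency against log rank is close to a straight line with slope -s. The sketch below is a minimal example under stated assumptions, not the authors' pipeline. The function names `rank_frequency` and `zipf_slope`, the plain least-squares fit, and the short sample string are all hypothetical choices made for illustration only.

```python
from collections import Counter
import math


def rank_frequency(text):
    """Return character counts sorted from most to least frequent."""
    # Assumption: only whitespace is skipped; a real study would also
    # decide how to treat punctuation and variant characters.
    counts = Counter(ch for ch in text if not ch.isspace())
    return sorted(counts.values(), reverse=True)


def zipf_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Illustrative sample only; far too short for a meaningful fit.
sample = "床前明月光疑是地上霜舉頭望明月低頭思故鄉"
freqs = rank_frequency(sample)
slope = zipf_slope(freqs)
print(freqs[:5], round(slope, 3))
```

A slope near -1 on a large corpus would indicate a close match to the classic form of Zipf's law. Comparing the fitted curves of two corpora is one simple way to look at the kind of deviations and similarities the abstract describes.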
Taxonomy
Topics: Computational and Text Analysis Methods · Authorship Attribution and Profiling · Language and cultural evolution
