Computational Paremiology: Charting the temporal, ecological dynamics of proverb use in books, news articles, and tweets
E. Davis, C. M. Danforth, W. Mieder, and P. S. Dodds

TL;DR
This study measures how often proverbs are used over time in three corpora: millions of books spanning centuries, hundreds of millions of news articles spanning twenty years, and billions of tweets spanning a decade. It relates the resulting usage trends to the cultural dynamics of each period.
Contribution
The paper contributes a large-scale, time-resolved analysis of proverb frequency across three genres and time frames, addressing the comparative lack of quantitative work on how proverb use changes over time.
Findings
Proverbs follow heavy-tailed frequency-of-usage rank distributions in all three corpora.
Proverb usage trends reflect the cultural dynamics of the eras covered.
Proverbs have evolved into contemporary forms on social media.
Abstract
Proverbs are an essential component of language and culture, and though much attention has been paid to their history and currency, there has been comparatively little quantitative work on changes in the frequency with which they are used over time. With wider availability of large corpora reflecting many diverse genres of documents, it is now possible to take a broad and dynamic view of the importance of the proverb. Here, we measure temporal changes in the relevance of proverbs within three corpora, differing in kind, scale, and time frame: Millions of books over centuries; hundreds of millions of news articles over twenty years; and billions of tweets over a decade. We find that proverbs present heavy-tailed frequency-of-usage rank distributions in each venue; exhibit trends reflecting the cultural dynamics of the eras covered; and have evolved into contemporary forms on social media.
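To make the idea of a frequency-of-usage rank distribution concrete, the following is a minimal Python sketch, not the authors' method. The function name `proverb_rank_frequency`, the toy documents, and the short proverb list are all hypothetical, and matching is done by naive case-insensitive substring counting, which is an assumption made for illustration only.

```python
from collections import Counter


def proverb_rank_frequency(documents, proverbs):
    """Count occurrences of each proverb and return (rank, proverb, count) tuples.

    Assumption: a proverb is matched as an exact, case-insensitive substring.
    Real corpora would need handling of variants, punctuation, and tokenization.
    """
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        for proverb in proverbs:
            counts[proverb] += text.count(proverb.lower())
    # Sort by descending count; ties are broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, proverb, count)
            for rank, (proverb, count) in enumerate(ranked, start=1)]


# Hypothetical toy corpus and proverb list, for illustration only.
toy_documents = [
    "Time flies when you are busy, and time flies when you are not.",
    "She reminded him that actions speak louder than words.",
    "Time flies, they say.",
    "Better late than never, he thought.",
]
toy_proverbs = [
    "time flies",
    "actions speak louder than words",
    "better late than never",
    "a stitch in time saves nine",
]

ranking = proverb_rank_frequency(toy_documents, toy_proverbs)
for rank, proverb, count in ranking:
    print(rank, proverb, count)
```

In a heavy-tailed distribution, a few top-ranked proverbs account for most of the usage while many others appear rarely or not at all. Plotting count against rank on logarithmic axes is a common way to inspect this shape.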
Taxonomy
Topics: Language, Metaphor, and Cognition · Language and cultural evolution · Natural Language Processing Techniques
