Identification of Literary Movements Using Complex Networks to Represent Texts
Diego R. Amancio, Osvaldo N. Oliveira Jr., Luciano da F. Costa

TL;DR
This study represents books published from 1590 to 1922 as complex networks and clusters their network metrics to identify and characterize major literary movements of the last five centuries, revealing long-term stylistic trends.
Contribution
It introduces a novel method of representing literary texts as complex networks and applies multivariate analysis to uncover historical literary patterns.
Findings
Six clusters of books correspond to historical literary movements.
Average shortest path length correlates with syntactic complexity.
Stylistic changes are driven by opposition to earlier styles.
Abstract
The use of statistical methods to analyze large databases of text has been useful to unveil patterns of human behavior and establish historical links between cultures and languages. In this study, we identify literary movements by treating books published from 1590 to 1922 as complex networks, whose metrics were analyzed with multivariate techniques to generate six clusters of books. The latter correspond to time periods coinciding with relevant literary movements over the last 5 centuries. The most important factor contributing to the distinction between different literary styles was the average shortest path length (particularly, the asymmetry of the distribution). Furthermore, over time there has been a trend toward larger average shortest path lengths, which is correlated with increased syntactic complexity, and a more uniform use of the words reflected in a smaller power-law…
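To make the central metric concrete, here is a minimal Python sketch of how a text might be turned into a network and how the average shortest path length, and the asymmetry of its distribution, could be measured. It rests on assumptions not stated in this summary: words are linked when they appear next to each other (a word adjacency network), links are treated as undirected, no stopword removal or lemmatization is applied, and asymmetry is measured as the third standardized moment (skewness). The function names are hypothetical and this is not the authors' implementation.

```python
import re
from collections import defaultdict, deque


def build_adjacency_network(text):
    """Link each word to the word that follows it (assumed construction)."""
    words = re.findall(r"[a-z']+", text.lower())
    graph = defaultdict(set)
    for first, second in zip(words, words[1:]):
        if first != second:  # ignore self-loops from repeated words
            graph[first].add(second)
            graph[second].add(first)  # treated as undirected in this sketch
    return dict(graph)


def mean_shortest_path_per_node(graph):
    """For each word, average the BFS distance to every reachable word."""
    averages = {}
    for source in graph:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        others = [d for node, d in distance.items() if node != source]
        if others:
            averages[source] = sum(others) / len(others)
    return averages


def skewness(values):
    """Third standardized moment, used here as the asymmetry measure."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the rug"
    per_node = mean_shortest_path_per_node(build_adjacency_network(sample))
    values = list(per_node.values())
    print("average shortest path length:", sum(values) / len(values))
    print("skewness of the distribution:", skewness(values))
```

Computed for each book, values like these would form part of the feature vector that the multivariate analysis clusters; the choice of clustering technique is left open here because the summary does not name it.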
