A data science and machine learning approach to continuous analysis of Shakespeare's plays
Charles Swisher, Lior Shamir

TL;DR
This paper applies machine learning to analyze Shakespeare's plays, revealing stylistic changes over time and demonstrating the potential of quantitative text analysis in literary studies.
Contribution
The paper applies a comprehensive machine learning analysis to Shakespeare's plays, measuring how his writing style evolved and predicting the year of each play from quantitative text features.
Findings
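Shakespeare's writing style changes clearly over time, most notably in sentence length, the frequency of adjectives and adverbs, and the sentiments expressed in the text.
A stylometric prediction of each play's year reaches a Pearson correlation of 0.71 with the actual year.
The style of some plays is closer to plays written before or after their actual year than to plays from the same period.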
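To make the correlation figure concrete, here is a minimal sketch of how a Pearson correlation between actual and predicted years can be computed, and how per-play residuals flag plays whose style looks earlier or later than their date. The play labels and year values below are hypothetical placeholders chosen for illustration only; they are not data or results from the paper, and the function names are assumptions rather than the authors' code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def residuals(labels, actual, predicted):
    """Predicted minus actual year per play; negative means 'reads earlier'."""
    return {name: p - a for name, a, p in zip(labels, actual, predicted)}


# Hypothetical placeholder values, NOT taken from the paper.
plays = ["play_a", "play_b", "play_c", "play_d", "play_e", "play_f"]
actual_years = [1591, 1595, 1599, 1603, 1607, 1611]
predicted_years = [1594, 1593, 1600, 1601, 1608, 1606]

r = pearson(actual_years, predicted_years)
offsets = residuals(plays, actual_years, predicted_years)
print(f"Pearson r = {r:.2f}")
for name, diff in offsets.items():
    print(f"{name}: predicted year differs from actual by {diff:+d}")
```

A large positive or negative residual for a play corresponds to the third finding above: its measured style sits closer to plays from a different period than to its own.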
Abstract
The availability of quantitative text analysis methods has provided new ways of analyzing literature in a manner that was not available in the pre-information era. Here we apply comprehensive machine learning analysis to the work of William Shakespeare. The analysis shows clear changes in the style of writing over time, with the most significant changes in sentence length, the frequency of adjectives and adverbs, and the sentiments expressed in the text. Applying machine learning to make a stylometric prediction of the year of the play shows a Pearson correlation of 0.71 between the actual and predicted year, indicating that Shakespeare's writing style, as reflected by the quantitative measurements, changed over time. Additionally, it shows that the stylometrics of some plays are more similar to those of plays written before or after the year in which they were written. For instance, Romeo and…
Taxonomy
Topics: Authorship Attribution and Profiling · Computational and Text Analysis Methods
