A Stylometric Application of Large Language Models
Harrison F. Stropkay, Jiayi Chen, Mohammad J. Latifi, Daniel N. Rockmore, Jeremy R. Manning

TL;DR
This paper shows that GPT-2 models, each trained from scratch on a single author's works, can distinguish the writing styles of different authors, with applications to authorship attribution and stylometry.
Contribution
It introduces a novel method of using LLMs trained on single authors to capture and identify distinctive writing styles for authorship attribution.
Findings
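A GPT-2 model trained on one author's works predicts held-out text by that author more accurately than held-out text by other authors.
The approach correctly distinguishes the books of eight known authors and also handles a disputed case.
The method confirms R. P. Thompson's authorship of the 15th book of the Oz series, originally attributed to F. L. Baum.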
Abstract
We show that large language models (LLMs) can be used to distinguish the writings of different authors. Specifically, an individual GPT-2 model, trained from scratch on the works of one author, will predict held-out text from that author more accurately than held-out text from other authors. We suggest that, in this way, a model trained on one author's works embodies the unique writing style of that author. We first demonstrate our approach on books written by eight different (known) authors. We also use this approach to confirm R. P. Thompson's authorship of the well-studied 15th book of the Oz series, originally attributed to F. L. Baum.
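The decision rule described in the abstract (train one model per author, then attribute a held-out text to the author whose model predicts it best) can be sketched as follows. This is a minimal illustration, not the paper's implementation: a character-bigram model with add-one smoothing stands in for GPT-2 so the example runs without any downloads, and the author names, toy corpora, and function names are all hypothetical.

```python
import math
from collections import Counter

def train_char_bigram(text):
    """Count character bigrams and their left contexts.
    Stand-in (assumption) for training one GPT-2 model per author."""
    pairs = Counter(zip(text, text[1:]))
    contexts = Counter(text[:-1])
    return pairs, contexts

def cross_entropy(model, text, vocab_size):
    """Mean negative log-likelihood in nats per predicted character,
    using add-one smoothing so unseen bigrams get non-zero probability."""
    pairs, contexts = model
    if len(text) < 2:
        raise ValueError("need at least two characters to score")
    total = 0.0
    for a, b in zip(text, text[1:]):
        p = (pairs[(a, b)] + 1) / (contexts[a] + vocab_size)
        total -= math.log(p)
    return total / (len(text) - 1)

def attribute(models, held_out, vocab_size):
    """Score held-out text under every author model; lowest loss wins."""
    losses = {name: cross_entropy(m, held_out, vocab_size)
              for name, m in models.items()}
    return min(losses, key=losses.get), losses

# Hypothetical toy corpora; real use would need full books per author.
corpora = {
    "author_a": "the cat sat on the mat. " * 20,
    "author_b": "zebras quiz jazzy foxes. " * 20,
}
held_out = "the mat sat on the cat."

# Shared vocabulary so losses are comparable across models.
vocab_size = len(set("".join(corpora.values()) + held_out))
models = {name: train_char_bigram(text) for name, text in corpora.items()}

best, losses = attribute(models, held_out, vocab_size)
print(best, losses)
```

With a GPT-2 model in place of the bigram counts, the per-author loss would be the model's mean cross-entropy on the held-out tokens; the comparison step stays the same.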
Code & Models
- 🤗 contextlab/gpt2-baum (model) · 4 dl · ♡ 1
- 🤗 contextlab/gpt2-melville (model) · 5 dl · ♡ 1
- 🤗 contextlab/gpt2-twain (model) · 4 dl · ♡ 1
- 🤗 contextlab/gpt2-wells (model) · 3 dl · ♡ 1
- 🤗 contextlab/gpt2-austen (model) · 3 dl · ♡ 1
- 🤗 contextlab/gpt2-fitzgerald (model) · 5 dl · ♡ 1
- 🤗 contextlab/gpt2-dickens (model) · 3 dl · ♡ 1
- 🤗 contextlab/gpt2-thompson (model) · 5 dl · ♡ 1
