Sui Generis: Large Language Models for Authorship Attribution and Verification in Latin

Gleb Schmidt; Svetlana Gorovaia; Ivan P. Yamshchikov

arXiv:2410.09245 · cs.CL · October 15, 2024

Open Access

TL;DR

This study assesses how well Large Language Models handle authorship attribution and verification in Latin. It finds them robust in zero-shot verification, but easily misled by semantic content and hard to steer toward nuanced decisions.

Contribution

The paper demonstrates the capabilities and limitations of LLMs in Latin authorship tasks, highlights how the results differ from those reported for high-resource modern languages, and emphasizes the need for extensive experimentation.

Findings

01. LLMs can be robust in zero-shot authorship verification on Latin texts, even short ones.

02. Models can be misled by semantic content, which reduces accuracy.

03. Steering LLMs toward nuanced decisions remains challenging.

Abstract

This paper evaluates the performance of Large Language Models (LLMs) in authorship attribution and authorship verification tasks for Latin texts of the Patristic Era. The study shows that LLMs can be robust in zero-shot authorship verification even on short texts without sophisticated feature engineering. Yet, the models can also be easily "misled" by semantics. The experiments also demonstrate that steering the model's authorship analysis and decision-making is challenging, unlike what is reported in studies dealing with high-resource modern languages. Although LLMs prove able to beat the traditional baselines under certain circumstances, obtaining a nuanced and truly explainable decision requires, at best, a lot of experimentation.

Taxonomy

Topics: Authorship Attribution and Profiling · Natural Language Processing Techniques