Rdgai: Classifying transcriptional changes using Large Language Models with a test case from an Arabic Gospel tradition
Robert Turnbull

TL;DR
Rdgai is a software tool that leverages large language models to automate the classification of textual variants in phylogenetic analysis, demonstrated on an Arabic Gospel tradition.
Contribution
It introduces an automated method using multilingual LLMs for classifying textual changes, reducing manual effort in phylogenetic studies.
Findings
Automates classification of textual variants using LLMs.
Stores classifications in TEI XML format.
Applied successfully to Arabic Gospel data.
Abstract
Application of phylogenetic methods to textual traditions has traditionally treated all changes as equivalent even though it is widely recognized that certain types of variants were more likely to be introduced than others. While it is possible to give weights to certain changes using a maximum parsimony evaluation criterion, it is difficult to state a priori what these weights should be. Probabilistic methods, such as Bayesian phylogenetics, allow users to create categories of changes, and the transition rates for each category can be estimated as part of the analysis. This classification of types of changes in readings also allows for inspecting the probability of these categories across each branch in the resulting trees. However, classification of readings is time-consuming, as it requires categorizing each reading against every other reading at each variation unit, presenting a…
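To illustrate why classifying each reading against every other reading becomes laborious, the following minimal sketch enumerates the unordered reading pairs at each variation unit. It is not taken from Rdgai: the function name `reading_pairs`, the unit identifiers, and the reading labels are hypothetical placeholders, and the sketch assumes classifications are needed once per unordered pair (directional classifications would double the count).

```python
from itertools import combinations


def reading_pairs(variation_units):
    """Yield (unit_id, reading_a, reading_b) for every unordered pair of
    readings within each variation unit.

    `variation_units` maps a unit identifier to its list of readings.
    All names here are hypothetical and only serve the illustration.
    """
    for unit_id, readings in variation_units.items():
        for reading_a, reading_b in combinations(readings, 2):
            yield unit_id, reading_a, reading_b


# Placeholder data: one unit with two readings, one unit with four.
units = {
    "unit1": ["r1", "r2"],
    "unit2": ["r1", "r2", "r3", "r4"],
}

pairs = list(reading_pairs(units))

# A unit with n readings contributes n * (n - 1) / 2 pairs,
# so the second unit alone needs 6 classifications.
for unit_id, reading_a, reading_b in pairs:
    print(unit_id, reading_a, reading_b)
```

Because the number of pairs grows quadratically with the number of readings in a unit, even a modest apparatus can require many individual classification decisions.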
Taxonomy
Topics: Digital Humanities and Scholarship · Authorship Attribution and Profiling · Language and cultural evolution
