Exploring Large Language Models for Classical Philology
Frederick Riemenschneider, Anette Frank

TL;DR
This paper develops and benchmarks four new language models for Ancient Greek, exploring different architectures and multilingual settings, and demonstrates their superior performance on morphological and syntactic tasks, providing valuable resources for Classical Philology.
Contribution
It introduces four novel Ancient Greek language models with varied architectures and multilingual training, and provides the first comprehensive benchmarking and resources for Classical language modeling.
Findings
The new models outperform previous state-of-the-art results on morphological and syntactic tasks.
T5's decoding enhances lemmatization performance.
Systematic analysis informs future model design for Classical languages.
Abstract
Recent advances in NLP have led to the creation of powerful language models for many languages including Ancient Greek and Latin. While prior work on Classical languages unanimously uses BERT, in this work we create four language models for Ancient Greek that vary along two dimensions to study their versatility for tasks of interest for Classical languages: we explore (i) encoder-only and encoder-decoder architectures using RoBERTa and T5 as strong model types, and create for each of them (ii) a monolingual Ancient Greek and a multilingual instance that includes Latin and English. We evaluate all models on morphological and syntactic tasks, including lemmatization, which demonstrates the added value of T5's decoding abilities. We further define two probing tasks to investigate the knowledge acquired by models pre-trained on Classical texts. Our experiments provide the first benchmarking…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Gated Linear Unit · Attention Is All You Need · Adafactor · WordPiece · Softmax · Layer Normalization · Inverse Square Root Schedule · Byte Pair Encoding · Dropout
