AI-Driven Generation of Old English: A Framework for Low-Resource Languages

Rodrigo Gabriel Salazar Alva; Matías Nuñez; Cristian López; Javier Martín Arista

arXiv:2507.20111·cs.CL·July 29, 2025

TL;DR

This paper introduces a scalable framework that uses large language models, LoRA fine-tuning, backtranslation, and a dual-agent pipeline to generate high-quality Old English text, helping to preserve and extend the resources available for this under-resourced historical language.

Contribution

The paper combines parameter-efficient fine-tuning (LoRA), data augmentation via backtranslation, and a dual-agent pipeline for Old English text generation, with the aim of advancing NLP for low-resource languages.
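
To illustrate why LoRA counts as parameter-efficient, the sketch below compares the trainable parameters of a full weight matrix with those of a rank-r LoRA update (two factors of shape d_out × r and r × d_in). The layer size and rank are hypothetical values chosen for illustration; this page does not state the paper's settings.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates every entry of the d_out x d_in matrix.
    LoRA freezes it and trains two low-rank factors instead:
    B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical layer size and rank, for illustration only.
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4%}")
```

With these assumed values the LoRA update trains well under one percent of the parameters that full fine-tuning would touch for that matrix.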

Findings

01 BLEU scores rose from 26 to over 65 for English-to-Old English translation.

02 Expert human assessment confirmed high grammatical accuracy.

03 The method effectively expands the Old English corpus.

Abstract

Preserving ancient languages is essential for understanding humanity's cultural and linguistic heritage, yet Old English remains critically under-resourced, limiting its accessibility to modern natural language processing (NLP) techniques. We present a scalable framework that uses advanced large language models (LLMs) to generate high-quality Old English texts, addressing this gap. Our approach combines parameter-efficient fine-tuning (Low-Rank Adaptation, LoRA), data augmentation via backtranslation, and a dual-agent pipeline that separates the tasks of content generation (in English) and translation (into Old English). Evaluation with automated metrics (BLEU, METEOR, and CHRF) shows significant improvements over baseline models, with BLEU scores increasing from 26 to over 65 for English-to-Old English translation. Expert human assessment also confirms high grammatical accuracy and…
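
The dual-agent split and the backtranslation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `[ang]` tag, and the toy stand-in agents are all hypothetical, and in practice each agent would be a call to a language model. The backtranslation shown is one common form, in which Old English-only text is translated into English to create synthetic parallel pairs.

```python
from typing import Callable, List, Tuple

# An "agent" here is any function that maps text to text (hypothetical interface).
TextFn = Callable[[str], str]

TAG = "[ang] "  # hypothetical marker standing in for real Old English output


def dual_agent_generate(prompt: str, writer: TextFn, translator: TextFn) -> Tuple[str, str]:
    """Agent 1 writes content in English; agent 2 translates it into Old English."""
    english = writer(prompt)
    old_english = translator(english)
    return english, old_english


def backtranslate(old_english_texts: List[str], ang_to_en: TextFn) -> List[Tuple[str, str]]:
    """Build synthetic (English, Old English) pairs from Old English-only text."""
    return [(ang_to_en(text), text) for text in old_english_texts]


# Toy stand-ins so the sketch runs without any model; they do not translate anything.
def toy_writer(prompt: str) -> str:
    return f"A short story about {prompt}."


def toy_en_to_ang(text: str) -> str:
    return TAG + text


def toy_ang_to_en(text: str) -> str:
    return text[len(TAG):] if text.startswith(TAG) else text


english, old_english = dual_agent_generate("a king", toy_writer, toy_en_to_ang)
pairs = backtranslate([old_english], toy_ang_to_en)
print(english)
print(old_english)
print(pairs)
```

Keeping the two agents behind the same text-to-text interface means the translator can be replaced by a fine-tuned model without changing the content-generation step.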
