TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages

Aleksei Dorkin; Kairit Sirts

arXiv:2404.12845·cs.CL·December 10, 2024

Open Access

TL;DR

This paper describes a lightweight, adapter-based method for adapting XLM-RoBERTa to NLP tasks in ancient and historical languages. Applied to 16 languages, it placed second of three submissions in the SIGTYP 2024 shared task.

Contribution

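The paper introduces a uniform, parameter-efficient fine-tuning method that stacks language- and task-specific adapters and applies the same setup to five tasks (morphological annotation, POS-tagging, lemmatization, and character- and word-level gap-filling) across 16 ancient and historical languages.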

Findings

01. Second place overall in shared task

02. First place in word-level gap-filling

03. Feasibility of adapting modern language models to historical languages

Abstract

We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. We developed a simple, uniform, and computationally lightweight approach based on the adapters framework using parameter-efficient fine-tuning. We applied the same adapter-based approach uniformly to all tasks and 16 languages by fine-tuning stacked language- and task-specific adapters. Our submission obtained an overall second place out of three submissions, with the first place in word-level gap-filling. Our results show the feasibility of adapting language models pre-trained on modern languages to historical and ancient languages via adapter training.
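The stacking idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the class `BottleneckAdapter`, the helper `stack`, the toy dimensions, and the zero initialisation of the up-projection are all assumptions made for illustration, and biases and layer normalisation are omitted. Each adapter is a small residual bottleneck applied to a hidden vector from the frozen model, and a language adapter is applied before a task adapter.

```python
import random


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


class BottleneckAdapter:
    """Hypothetical residual bottleneck: h -> h + W_up * relu(W_down * h).

    Only these two small matrices would be trained; the pre-trained
    model that produces h is assumed to stay frozen.
    """

    def __init__(self, hidden, bottleneck, rng):
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(hidden)]
                     for _ in range(bottleneck)]
        # Assumption: zero-initialised up-projection, so an untrained
        # adapter leaves the hidden state unchanged.
        self.up = [[0.0] * bottleneck for _ in range(hidden)]

    def __call__(self, h):
        z = [max(0.0, v) for v in matvec(self.down, h)]  # ReLU
        delta = matvec(self.up, z)
        return [a + b for a, b in zip(h, delta)]  # residual connection

    def num_params(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


def stack(*adapters):
    """Compose adapters so each one consumes the previous one's output."""
    def apply(h):
        for adapter in adapters:
            h = adapter(h)
        return h
    return apply


rng = random.Random(0)
hidden, bottleneck = 8, 2  # toy sizes, far smaller than a real model

lang_adapter = BottleneckAdapter(hidden, bottleneck, rng)  # one per language
task_adapter = BottleneckAdapter(hidden, bottleneck, rng)  # one per task
stacked = stack(lang_adapter, task_adapter)

h = [rng.gauss(0.0, 1.0) for _ in range(hidden)]  # stand-in hidden state
out = stacked(h)
```

In this sketch each adapter holds 2 × hidden × bottleneck weights, which is the sense in which the approach is parameter-efficient: the trainable part grows with the bottleneck width rather than with the size of the frozen model.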

Taxonomy

Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing · Topic Modeling

Methods: Adapter