Adapting Definition Modeling for New Languages: A Case Study on Belarusian

Daniela Kazakouskaya; Timothee Mickus; Janine Siewert

arXiv:2507.09536·cs.CL·July 15, 2025

TL;DR

This paper explores adapting definition modeling to Belarusian, demonstrating that existing models can be effectively transferred with minimal data, though current metrics have limitations in capturing all aspects of quality.

Contribution

The study introduces a new Belarusian dataset for definition modeling and shows how existing models can be adapted with limited data for under-resourced languages.

Findings

01 Effective adaptation with minimal data
02 Gaps identified in automatic metric evaluations
03 New Belarusian definition dataset created

Abstract

Definition modeling, the task of generating new definitions for words in context, holds great promise as a means to assist lexicographers in documenting a broader variety of lects and languages, yet much remains to be done to assess how we can leverage pre-existing models for as-yet unsupported languages. In this work, we focus on adapting existing models to Belarusian, for which we propose a novel dataset of 43,150 definitions. Our experiments demonstrate that adapting definition modeling systems requires minimal amounts of data, but that there are currently gaps in what automatic metrics capture.

Taxonomy

Topics: linguistics and terminology studies