Mind the Language Gap in Digital Humanities: LLM-Aided Translation of SKOS Thesauri
Felix Kraus, Nicolas Blumenröhr, Danah Tonne, Achim Streit

TL;DR
This paper presents WOKIE, an open-source pipeline that automates the translation of SKOS thesauri using external translation services and LLMs, improving multilingual access and interoperability in Digital Humanities.
Contribution
The work introduces WOKIE, a modular, easy-to-use system for translating knowledge resources in multiple languages, combining translation services with LLM-based refinement.
Findings
WOKIE translates SKOS thesauri across 15 languages, evaluated on several DH thesauri with different translation services and LLMs.
It enhances ontology matching and cross-lingual interoperability.
The system runs on everyday hardware and requires no prior expertise in machine translation or LLMs.
Abstract
We introduce WOKIE, an open-source, modular, and ready-to-use pipeline for the automated translation of SKOS thesauri. This work addresses a critical need in the Digital Humanities (DH), where language diversity can limit access, reuse, and semantic interoperability of knowledge resources. WOKIE combines external translation services with targeted refinement using Large Language Models (LLMs), balancing translation quality, scalability, and cost. Designed to run on everyday hardware and be easily extended, the application requires no prior expertise in machine translation or LLMs. We evaluate WOKIE across several DH thesauri in 15 languages with different parameters, translation services and LLMs, systematically analysing translation quality, performance, and ontology matching improvements. Our results show that WOKIE is suitable to enhance the accessibility, reuse, and cross-lingual…
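To make the idea of "external translation services with targeted LLM refinement" concrete, here is a minimal sketch of one way such a pipeline could be organised. It is not WOKIE's actual implementation: the agreement threshold, the function names (translate_label, translate_thesaurus), the stub services, and the dictionary representation of a thesaurus are all assumptions made for illustration. The sketch accepts a translation when enough services agree and falls back to a refiner (a stand-in for an LLM call) only when they disagree.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces, not taken from WOKIE:
# a translator maps (text, target_lang) to a candidate translation,
# a refiner receives the source term plus all candidates and returns one label.
Translator = Callable[[str, str], str]
Refiner = Callable[[str, List[str], str], str]


def translate_label(term: str, target_lang: str, services: List[Translator],
                    refine: Refiner, min_agreement: int = 2) -> Tuple[str, bool]:
    """Return (translation, used_refiner) for a single label."""
    if not services:
        raise ValueError("at least one translation service is required")
    candidates = [svc(term, target_lang).strip() for svc in services]
    # Compare candidates case-insensitively to count agreement.
    votes = Counter(c.lower() for c in candidates)
    best, count = votes.most_common(1)[0]
    if count >= min_agreement:
        # Enough services agree: keep the first matching candidate as written.
        return next(c for c in candidates if c.lower() == best), False
    # Services disagree: hand the decision to the refiner (assumed LLM step).
    return refine(term, candidates, target_lang), True


def translate_thesaurus(concepts: Dict[str, Dict[str, str]], source_lang: str,
                        target_lang: str, services: List[Translator],
                        refine: Refiner) -> Dict[str, Dict[str, str]]:
    """Add a target-language prefLabel to each concept that lacks one.

    `concepts` maps a concept URI to {language tag: prefLabel}.
    The input is not modified; a new mapping is returned.
    """
    result = {}
    for uri, labels in concepts.items():
        new_labels = dict(labels)
        if target_lang not in labels and source_lang in labels:
            new_labels[target_lang], _ = translate_label(
                labels[source_lang], target_lang, services, refine)
        result[uri] = new_labels
    return result


def make_stub_service(table: Dict[Tuple[str, str], str]) -> Translator:
    """Build a fake translation service backed by a lookup table."""
    return lambda text, lang: table.get((text, lang), text)


# Two stub services that agree on "Kirche" but disagree on "Schloss".
svc_a = make_stub_service({("Kirche", "en"): "church", ("Schloss", "en"): "castle"})
svc_b = make_stub_service({("Kirche", "en"): "Church", ("Schloss", "en"): "lock"})


def stub_refiner(term: str, candidates: List[str], lang: str) -> str:
    # Placeholder for an LLM call; here it simply picks the first candidate.
    return candidates[0]


concepts = {"ex:c1": {"de": "Kirche"}, "ex:c2": {"de": "Schloss"}}
translated = translate_thesaurus(concepts, "de", "en", [svc_a, svc_b], stub_refiner)
```

In this sketch the refiner is consulted only for the ambiguous term "Schloss", which illustrates how limiting LLM calls to disagreements could help balance quality against cost. A real pipeline would read and write SKOS data with an RDF library and call actual translation and LLM APIs in place of the stubs.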
