Generative AI and Large Language Models in Language Preservation: Opportunities and Challenges
Vincent Koc

TL;DR
This paper presents a framework for applying Generative AI and Large Language Models to language preservation, emphasizing community governance, ethical safeguards, and practical evaluation, demonstrated through the revitalization of Te Reo Māori.
Contribution
It introduces a novel analytical framework for evaluating GenAI applications in language preservation, integrating community needs and ethical considerations, with practical validation on Te Reo Māori.
Findings
Achieved 92% accuracy in community-led speech recognition
Identified challenges in data sovereignty and model bias
Demonstrated the framework's effectiveness in real-world revitalization efforts
Abstract
The global crisis of language endangerment meets a technological turning point as Generative AI (GenAI) and Large Language Models (LLMs) unlock new frontiers in automating corpus creation, transcription, translation, and tutoring. However, this promise is imperiled by fragmented practices and the critical lack of a methodology to navigate the fraught balance between LLM capabilities and the profound risks of data scarcity, cultural misappropriation, and ethical missteps. This paper introduces a novel analytical framework that systematically evaluates GenAI applications against language-specific needs, embedding community governance and ethical safeguards as foundational pillars. We demonstrate its efficacy through the Te Reo Māori revitalization, where it illuminates successes, such as community-led Automatic Speech Recognition achieving 92% accuracy, while critically surfacing…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
