Not always about you: Prioritizing community needs when developing endangered language technology
Zoey Liu, Crystal Richardson, Richard Hatcher Jr, Emily Prud'hommeaux

TL;DR
This paper discusses the unique challenges and ethical considerations in developing language technology for endangered languages, emphasizing community collaboration and cultural sensitivity.
Contribution
It highlights the importance of community needs and ethical collaboration in creating language technology for endangered languages, offering practical recommendations.
Findings
Community involvement is crucial for successful language technology development.
Ethical challenges include respecting cultural values and community priorities.
Collaborative approaches improve resource development and language revitalization.
Abstract
Languages are classified as low-resource when they lack the quantity of data necessary for training statistical and machine learning tools and models. Causes of resource scarcity vary but can include poor access to technology for developing these resources, a relatively small population of speakers, or a lack of urgency for collecting such resources in bilingual populations where the second language is high-resource. As a result, the languages described as low-resource in the literature are as different as Finnish on the one hand, with millions of speakers using it in every imaginable domain, and Seneca, with only a small handful of fluent speakers using the language primarily in a restricted domain. While issues stemming from the lack of resources necessary to train models unite this disparate group of languages, many other issues cut across the divide between widely-spoken low…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
