Is It Navajo? Accurate Language Detection in Endangered Athabaskan Languages
Ivory Yang, Weicheng Ma, Chunhui Zhang, Soroush Vosoughi

TL;DR
This paper presents a highly accurate random forest classifier for Navajo language detection, addressing the absence of Native American language support in Google's LangID and showing potential for broader application to related Athabaskan languages.
Contribution
Introduces a simple yet effective random forest classifier trained on Navajo and the twenty languages that Google's LangID erroneously suggested, improving language detection for endangered Athabaskan languages and highlighting the need for culturally inclusive NLP tools.
Findings
Achieves 97-100% accuracy in Navajo detection
Demonstrates robustness across related Athabaskan languages
Highlights the importance of linguistic diversity in NLP
Abstract
Endangered languages, such as Navajo - the most widely spoken Native American language - are significantly underrepresented in contemporary language technologies, exacerbating the challenges of their preservation and revitalization. This study evaluates Google's Language Identification (LangID) tool, which does not currently support any Native American languages. To address this gap, we introduce a random forest classifier trained on Navajo and twenty languages erroneously suggested by LangID. Despite its simplicity, the classifier achieves near-perfect accuracy (97-100%). Additionally, the model demonstrates robustness across other Athabaskan languages - a family of Native American languages spoken primarily in Alaska, the Pacific Northwest, and parts of the Southwestern United States - suggesting its potential for broader application. Our findings underscore the pressing need for NLP…
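The classifier described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the character n-gram features, the hyperparameters, and the handful of toy sentences are all assumptions made here, since the abstract does not specify the features or the training data format. Three languages stand in for the twenty-one classes mentioned above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

# Toy placeholder sentences, NOT the paper's dataset.
train_texts = [
    "Yá'át'ééh",              # Navajo greeting
    "Diné bizaad",            # "Navajo language"
    "Ahéhee'",                # Navajo "thank you"
    "Hello, how are you?",
    "Thank you very much",
    "Good morning everyone",
    "Hola, ¿cómo estás?",
    "Muchas gracias",
    "Buenos días a todos",
]
train_labels = ["navajo"] * 3 + ["english"] * 3 + ["spanish"] * 3

# Assumption: character 1- to 3-gram counts as features. Accents are kept
# (CountVectorizer does not strip them by default), which matters for
# languages written with many diacritics.
detector = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(1, 3)),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
detector.fit(train_texts, train_labels)

# Classify an unseen string and inspect the per-language vote shares.
sample = ["Yá'át'ééh abíní"]
prediction = detector.predict(sample)[0]
probabilities = detector.predict_proba(sample)[0]
print(prediction, dict(zip(detector.classes_, probabilities.round(2))))
```

With so little data the output is only illustrative. A realistic setup would train on many sentences per language and report accuracy on a held-out test set.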
Taxonomy
Topics: Natural Language Processing Techniques
