Natural Language Processing for Tigrinya: Current State and Future Directions
Fitsum Gaim, Jong C. Park

TL;DR
This survey reviews the progress and challenges of NLP for Tigrinya, highlighting resource development, model evolution, and future research directions to advance language technology for this underrepresented language.
Contribution
The survey provides the first comprehensive overview of Tigrinya NLP research, analyzing over 50 studies published between 2011 and 2025 and proposing future directions for resource creation and model development.
Findings
Shift from rule-based to neural models
Resource scarcity hinders progress
Promising directions include cross-lingual transfer
Abstract
Despite being spoken by millions of people, Tigrinya remains severely underrepresented in Natural Language Processing (NLP) research. This work presents a comprehensive survey of NLP research for Tigrinya, analyzing over 50 studies from 2011 to 2025. We systematically review the current state of computational resources, models, and applications across fifteen downstream tasks, including morphological processing, part-of-speech tagging, named entity recognition, machine translation, question-answering, speech recognition, and synthesis. Our analysis reveals a clear trajectory from foundational, rule-based systems to modern neural architectures, with progress consistently driven by milestones in resource creation. We identify key challenges rooted in Tigrinya's morphological properties and resource scarcity, and highlight promising research directions, including morphology-aware modeling,…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
