Learnings from Technological Interventions in a Low Resource Language: Enhancing Information Access in Gondi
Devansh Mehta, Harshita Diddee, Ananya Saxena, Anurag Shukla, Sebastin Santy, Ramaravind Kommiya Mothilal, Brij Mohan Lal Srivastava, Alok Sharma, Vishnu Prasad, Venkanna U, Kalika Bali

TL;DR
This paper details the development of linguistic resources and a Hindi-Gondi machine translation model for Gondi, a low-resource language, using technology-driven data collection and model compression to improve digital access and community engagement.
Contribution
It introduces a scalable data collection method, a compressed machine translation model, and community engagement strategies for Gondi, a low-resource language.
Findings
Created a corpus of 26,240 translations for Gondi.
Developed a compressed Hindi-Gondi translation model suitable for edge devices.
Engaged 850 community members in language digitalization efforts.
Abstract
The primary obstacle to developing technologies for low-resource languages is the lack of representative, usable data. In this paper, we report the deployment of technology-driven data collection methods for creating a corpus of more than 60,000 translations from Hindi to Gondi, a low-resource, vulnerable language spoken by around 2.3 million tribal people in south and central India. During this process, we help expand information access in Gondi along two dimensions: (a) the creation of linguistic resources that can be used by the community, such as a dictionary, children's stories, Gondi translations from multiple sources, and an Interactive Voice Response (IVR) based mass awareness platform; (b) enabling its use in the digital domain by developing a Hindi-Gondi machine translation model, which is compressed by nearly 4 times to enable its edge deployment on low-resource edge…
Taxonomy
Topics: ICT in Developing Communities
