$\pi$-yalli: a new corpus for Nahuatl
Juan-Manuel Torres-Moreno, Juan-José Guzmán-Landa, Graham Ranger, Martha Lorena Avendaño Garrido, Miguel Figueroa-Saavedra, Ligia Quintana-Torres, Carlos-Emiliano González-Gallardo, Elvys Linhares Pontes, Patricia Velázquez Morales, Luis-Gil Moreno Jiménez

TL;DR
This paper introduces the $\pi$-YALLI corpus, a new resource for Nahuatl language processing, aiming to facilitate NLP tool development and language model research for this under-resourced language.
Contribution
The $\pi$-YALLI corpus, designed specifically for machine learning applications in Nahuatl, is a new resource that supports NLP research and tool development.
Findings
Development of a Nahuatl corpus for NLP research
Potential to enable various NLP tools for Nahuatl
Foundation for future language model training
Abstract
The NAHU project is a Franco-Mexican collaboration aimed at building the $\pi$-YALLI corpus, adapted to machine learning, which will subsequently be used to develop computational resources for the Nahuatl language. Nahuatl has few computational resources, even though it is a living language spoken by around 2 million people. We have decided to build $\pi$-YALLI, a corpus that will enable research on Nahuatl in order to develop Language Models (LM), whether dynamic or not. These models will in turn enable the development of Natural Language Processing (NLP) tools such as: a) a grapheme unifier, b) a word segmenter, c) a POS grammatical analyser, d) a content-based automatic text summarizer; and possibly, e) a translator (probabilistic or learning-based).
Taxonomy
Topics: Spanish Linguistics and Language Studies · Linguistic Studies and Language Acquisition · Historical Linguistics and Language Studies
