Comparing LLM prompting with Cross-lingual transfer performance on Indigenous and Low-resource Brazilian Languages
David Ifeoluwa Adelani, A. Seza Doğruöz, André Coneglian, Atul Kr. Ojha

TL;DR
This paper evaluates how large language models perform on NLP tasks for low-resource indigenous and Brazilian languages, revealing performance gaps compared to high-resource languages and analyzing underlying reasons.
Contribution
It provides a comparative analysis of LLM prompting effectiveness on 12 low-resource languages from Brazil and 2 from Africa, measured against 2 high-resource languages, highlighting challenges and error patterns.
Findings
LLMs perform worse on POS tagging for low-resource languages
Error analysis reveals specific linguistic and data-related challenges
Performance gap is significant compared to high-resource languages
Abstract
Large Language Models are transforming NLP for a variety of tasks. However, how LLMs perform NLP tasks for low-resource languages (LRLs) is less explored. In line with the goals of the AmericasNLP workshop, we focus on 12 LRLs from Brazil, 2 LRLs from Africa and 2 high-resource languages (HRLs) (e.g., English and Brazilian Portuguese). Our results indicate that the LLMs perform worse for the part of speech (POS) labeling of LRLs in comparison to HRLs. We explain the reasons behind this failure and provide an error analysis through examples observed in our data set.
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
