Comparing LLM prompting with Cross-lingual transfer performance on Indigenous and Low-resource Brazilian Languages

David Ifeoluwa Adelani; A. Seza Doğruöz; André Coneglian; Atul Kr. Ojha

arXiv:2404.18286·cs.CL·May 1, 2024

TL;DR

This paper evaluates how large language models perform on part-of-speech (POS) tagging for low-resource languages, chiefly Indigenous and other low-resource languages of Brazil. It finds that the models perform worse on these languages than on high-resource languages and analyzes the reasons.

Contribution

It provides a comparative analysis of LLM prompting on 12 low-resource languages from Brazil and 2 from Africa, with 2 high-resource languages (e.g., English and Brazilian Portuguese) as a reference point, and highlights challenges and error patterns.

Findings

01 LLMs perform worse on POS tagging for low-resource languages

02 Error analysis reveals specific linguistic and data-related challenges

03 Performance gap is significant compared to high-resource languages

Abstract

Large Language Models are transforming NLP for a variety of tasks. However, how LLMs perform NLP tasks for low-resource languages (LRLs) is less explored. In line with the goals of the AmericasNLP workshop, we focus on 12 LRLs from Brazil, 2 LRLs from Africa and 2 high-resource languages (HRLs) (e.g., English and Brazilian Portuguese). Our results indicate that the LLMs perform worse for the part of speech (POS) labeling of LRLs in comparison to HRLs. We explain the reasons behind this failure and provide an error analysis through examples observed in our data set.
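The comparison described in the abstract rests on scoring predicted POS tags against gold tags. As a minimal sketch, assuming token-level accuracy as the metric (the abstract does not state which metric the authors use), the scoring step might look like the following. The function name `pos_accuracy` and the example tags are hypothetical and chosen only for illustration; they are not taken from the paper or its data.

```python
from typing import Sequence


def pos_accuracy(
    gold: Sequence[Sequence[str]],
    pred: Sequence[Sequence[str]],
) -> float:
    """Token-level accuracy: fraction of tokens whose predicted tag equals the gold tag.

    Assumption: each sentence is a sequence of tags, and gold and predicted
    sentences are aligned token by token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("each predicted sentence must match its gold length")
        # Count positions where the predicted tag agrees with the gold tag.
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)

    # Return 0.0 for empty input rather than dividing by zero.
    return correct / total if total else 0.0


# Illustrative tags only (not from the paper's data set).
gold_tags = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
pred_tags = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]
score = pos_accuracy(gold_tags, pred_tags)  # 4 of 5 tokens match
```

Computing this score separately per language would give one way to contrast LRLs with HRLs. In practice, LLM output may not align token by token with the gold annotation, so a real evaluation would also need an alignment or fallback step that this sketch omits.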

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
