Performance in a dialectal profiling task of LLMs for varieties of Brazilian Portuguese
Raquel Meister Ko Freitag, Túlio Sousa de Gois

TL;DR
This study investigates how large language models recognize and discriminate between dialectal varieties of Brazilian Portuguese, revealing biases and sociolinguistic considerations in their responses.
Contribution
It provides an analysis of dialectal biases in four major LLMs using prompt engineering, contributing sociolinguistic insights for more equitable NLP technology.
Findings
LLMs show varying degrees of dialectal discrimination.
Prompt engineering can reveal underlying biases.
Sociolinguistic rules influence model responses.
Abstract
Different kinds of biases are reproduced in LLM-generated responses, including dialectal biases. A study based on prompt engineering was carried out to uncover how LLMs discriminate between varieties of Brazilian Portuguese, specifically whether sociolinguistic rules are taken into account by four LLMs: GPT 3.5, GPT-4o, Gemini, and Sabiá-2. The results offer sociolinguistic contributions for an equity-fluent NLP technology.
Taxonomy
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Multi-Head Attention · Dense Connections · Residual Connection · Dropout · Layer Normalization · Linear Warmup With Cosine Annealing
