Performance in a dialectal profiling task of LLMs for varieties of Brazilian Portuguese

Raquel Meister Ko Freitag; Túlio Sousa de Gois

arXiv:2410.10991 · cs.CL · January 6, 2025

TL;DR

This study investigates how large language models recognize and discriminate between dialectal varieties of Brazilian Portuguese, revealing biases and sociolinguistic considerations in their responses.

Contribution

It provides an analysis of dialectal biases in four major LLMs using prompt engineering, contributing sociolinguistic insights for more equitable NLP technology.

Findings

1. LLMs show varying degrees of dialectal discrimination.
2. Prompt engineering can reveal underlying biases.
3. Sociolinguistic rules influence model responses.

Abstract

Different kinds of biases are reproduced in LLM-generated responses, including dialectal biases. A study based on prompt engineering was carried out to uncover how LLMs discriminate between varieties of Brazilian Portuguese, specifically whether sociolinguistic rules are taken into account in four LLMs: GPT-3.5, GPT-4o, Gemini, and Sabiá-2. The results offer sociolinguistic contributions for an equity-fluent NLP technology.

Taxonomy

Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Multi-Head Attention · Dense Connections · Residual Connection · Dropout · Layer Normalization · Linear Warmup With Cosine Annealing