A Study of Nationality Bias in Names and Perplexity using Off-the-Shelf Affect-related Tweet Classifiers

Valentin Barriere; Sebastian Cifuentes

arXiv:2407.01834·cs.CL·November 26, 2024

TL;DR

This study investigates how country names influence affect-related tweet classifier predictions, revealing biases linked to language and training data, especially affecting English and less-resourced languages.

Contribution

It introduces a counterfactual perturbation method for bias detection in classifiers and analyzes the impact of country names on predictions across multiple affect-related tasks.

Findings

01

Country names strongly influence classifier predictions, changing them by up to 23% in hate speech detection.

02

The biases are hypothesized to stem from the training data of pre-trained language models (PLMs), especially for English.

03

Correlations between affect predictions and language likelihoods reveal language-specific biases.
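
The third finding relates two per-entity quantities: how much a classifier's prediction moves and how likely the language model finds the text. A minimal sketch of one way to measure such a relationship is a Pearson correlation, shown here in plain Python. The choice of Pearson is an assumption (the listing does not say which correlation measure the paper uses), and the function name `pearson` is hypothetical.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences.

    Illustrative assumption: xs could hold per-entity prediction shifts and
    ys per-entity likelihood scores; the paper's actual measure may differ.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near 1 or -1 would indicate that prediction shifts track likelihood closely; a value near 0 would indicate no linear relationship.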

Abstract

In this paper, we apply a method to quantify biases associated with named entities from various countries. We create counterfactual examples with small perturbations on target-domain data instead of relying on templates or specific datasets for bias detection. On widely used classifiers for subjectivity analysis, including sentiment, emotion, hate speech, and offensive text using Twitter data, our results demonstrate positive biases related to the language spoken in a country across all classifiers studied. Notably, the presence of certain country names in a sentence can strongly influence predictions, up to a 23% change in hate speech detection and up to a 60% change in the prediction of negative emotions such as anger. We hypothesize that these biases stem from the training data of pre-trained language models (PLMs) and find correlations between affect predictions and PLMs…
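
The counterfactual idea in the abstract can be sketched as follows: take real target-domain texts, swap one entity for another, and count how often the classifier's label changes. This is a minimal illustration under stated assumptions, not the authors' implementation. The helpers `perturb` and `prediction_shift`, the stand-in `toy_classifier`, and the fictional country names are all hypothetical; a real study would plug in an off-the-shelf tweet classifier.

```python
import re
from typing import Callable, Dict, List


def perturb(text: str, source: str, target: str) -> str:
    """Replace whole-word occurrences of `source` with `target`."""
    return re.sub(rf"\b{re.escape(source)}\b", target, text)


def prediction_shift(
    texts: List[str],
    source: str,
    targets: List[str],
    classify: Callable[[str], str],
) -> Dict[str, float]:
    """Fraction of texts whose predicted label changes after each swap."""
    if not texts:
        raise ValueError("texts must not be empty")
    originals = [classify(t) for t in texts]
    shifts: Dict[str, float] = {}
    for target in targets:
        changed = sum(
            classify(perturb(t, source, target)) != original
            for t, original in zip(texts, originals)
        )
        shifts[target] = changed / len(texts)
    return shifts


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in with a deliberately planted bias (assumption)."""
    return "negative" if "Freedonia" in text else "neutral"


if __name__ == "__main__":
    sample = ["I met someone from Sylvania today.", "Sylvania won the match."]
    print(prediction_shift(sample, "Sylvania", ["Freedonia", "Ruritania"], toy_classifier))
```

Because only the entity changes between the original and the perturbed text, any change in the predicted label can be attributed to the entity itself rather than to the rest of the sentence.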

Code & Models

Repositories

valbarriere/biases_ppl (Official)

Videos

A Study of Nationality Bias in Names and Perplexity using Off-the-Shelf Affect-related Tweet Classifiers

Taxonomy

Topics: Computational and Text Analysis Methods