An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models

Saghar Hosseini; Hamid Palangi; Ahmed Hassan Awadallah

arXiv:2301.09211·cs.CL·January 24, 2023

TL;DR

This paper introduces a language-modeling-based metric for implicit representational harms in pre-trained language models (PTLMs) towards 13 marginalized demographics, applies it to 24 widely used PTLMs, and examines how network depth and width relate to those harms.

Contribution

It proposes a novel metric for quantifying representational harms in PTLMs and provides an empirical analysis of biases across multiple models and architectures.

Findings

01. The new metric correlates with most of the existing gender-specific bias metrics.

02. Deeper models tend to have reduced biases compared to wider models.

03. Prioritizing depth over width can mitigate biases in PTLMs.

Abstract

Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from massive human-written data, which contains latent societal biases and toxic content. In this paper, we leverage the primary task of PTLMs, i.e., language modeling, and propose a new metric to quantify manifested implicit representational harms in PTLMs towards 13 marginalized demographics. Using this metric, we conducted an empirical analysis of 24 widely used PTLMs. Our analysis provides insights into the correlation between the metric proposed in this work and other related metrics for representational harm. We observe that our metric correlates with most of the gender-specific metrics in the literature. Through extensive experiments, we explore the connections between PTLM architectures and representational harms across two dimensions: the depth and width of the networks. We found that prioritizing depth over width,…
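
The abstract reports that the proposed metric correlates with other representational-harm metrics across models. As a rough illustration of what such a comparison involves, the sketch below computes a Spearman rank correlation between two lists of per-model scores. The choice of Spearman correlation is an assumption (the text above does not say which coefficient the authors used), the function names are hypothetical, and the score values are made-up placeholders, not results from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder scores, one entry per hypothetical model (NOT from the paper).
proposed_metric = [0.31, 0.42, 0.27, 0.55, 0.48]
other_metric = [0.12, 0.20, 0.10, 0.33, 0.25]

print(spearman(proposed_metric, other_metric))
```

A rank-based coefficient is used here because bias metrics are often on different scales, so agreement in how they order models is easier to interpret than agreement in raw values.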

Code & Models

Repositories

microsoft/SafeNLP (PyTorch, official)

Taxonomy

Topics: Topic Modeling