Bipol: A Novel Multi-Axes Bias Evaluation Metric with Explainability for NLP

Lama Alkhaled; Tosin Adewumi; Sana Sabah Sabry

arXiv:2304.04029·cs.CL·September 19, 2023·1 cites

TL;DR

Bipol is a new explainable metric for social bias detection in NLP that combines corpus-level classification and sentence-level term frequency analysis, supported by new bias detection models and a large labeled dataset.

Contribution

The paper introduces bipol, a novel bias evaluation metric with explainability, and provides new bias detection models along with a large labeled dataset for NLP.

Findings

01 Bipol effectively estimates social bias in NLP datasets.

02 New bias detection models outperform existing methods.

03 A large dataset with nearly 2 million labeled samples is made publicly available.

Abstract

We introduce bipol, a new metric with explainability, for estimating social bias in text data. Harmful bias is prevalent in many online sources of data that are used for training machine learning (ML) models. As a step towards addressing this challenge, we create a novel metric that involves a two-step process: corpus-level evaluation based on model classification and sentence-level evaluation based on (sensitive) term frequency (TF). After creating new models to detect bias along multiple axes using state-of-the-art (SotA) architectures, we evaluate two popular NLP datasets (COPA and SQuAD). As an additional contribution, we create a large dataset (with almost 2 million labelled samples) for training models in bias detection and make it publicly available. We also make our code public.
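The two-step process described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything below is an assumption made for illustration: the tiny `LEXICON`, the function names, the choice to measure imbalance as the gap between group term counts divided by the total sensitive-term count, and the choice to multiply the two scores. A real setup would replace the hard-coded `predictions` with the output of a trained bias classifier. This is not the authors' implementation; the official repository listed under Code & Models is the place to find that.

```python
import re
from collections import Counter

# Hypothetical sensitive-term lexicon: axis -> group -> terms.
# Illustrative only; not taken from the paper.
LEXICON = {
    "gender": {
        "female": {"she", "her", "woman"},
        "male": {"he", "his", "man"},
    },
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def corpus_level(predictions):
    """Step 1 (assumed form): fraction of samples a classifier flagged as biased (1)."""
    if not predictions:
        return 0.0
    return sum(predictions) / len(predictions)


def sentence_level(biased_texts, lexicon=LEXICON):
    """Step 2 (assumed form): mean term-frequency imbalance over flagged samples.

    For each flagged sample and each axis, the imbalance is the gap between
    the most and least frequent group, divided by all sensitive terms found.
    """
    if not biased_texts:
        return 0.0
    sample_scores = []
    for text in biased_texts:
        counts = Counter(tokenize(text))
        axis_scores = []
        for groups in lexicon.values():
            totals = [sum(counts[t] for t in terms) for terms in groups.values()]
            total = sum(totals)
            axis_scores.append((max(totals) - min(totals)) / total if total else 0.0)
        sample_scores.append(sum(axis_scores) / len(axis_scores))
    return sum(sample_scores) / len(sample_scores)


def bipol_like(texts, predictions):
    """Combine both steps by multiplication (an assumption of this sketch)."""
    flagged = [t for t, p in zip(texts, predictions) if p == 1]
    return corpus_level(predictions) * sentence_level(flagged)


texts = [
    "He said his plan was best",
    "The sky is blue",
    "She and he agreed",
    "Rain fell",
]
predictions = [1, 0, 1, 0]  # stand-in for classifier output
score = bipol_like(texts, predictions)
print(score)
```

With these toy inputs, half of the samples are flagged, one flagged sample uses only male terms (imbalance 1.0), and the other uses one term from each group (imbalance 0.0), so the combined score is 0.25. The per-axis term counts are what would make such a score explainable: they show which terms drive the imbalance.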

Code & Models

Repositories

tosingithub/bipol (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining