Soft-prompt Tuning for Large Language Models to Evaluate Bias

Jacob-Junqi Tian; David Emerson; Sevil Zanjani Miyandoab; Deval Pandya; Laleh Seyyed-Kalantari; Faiza Khan Khattak

arXiv:2306.04735 · cs.CL · March 6, 2024 · 2 cites

Open Access

TL;DR

This paper investigates using soft-prompt tuning on large language models to evaluate and identify biases in sentiment classification tasks, aiming to reduce human bias in prompt design.

Contribution

It introduces a bias evaluation method using soft-prompts that avoids human bias injection and provides insights into model biases across sensitive attributes.

Findings

01 Identified bias patterns in LLMs for different sensitive attributes
02 Demonstrated effectiveness of soft-prompts in bias evaluation
03 Open-sourced the bias evaluation pipeline

Abstract

Prompting large language models has gained immense popularity in recent years because it can produce good results without labelled data. However, this requires prompt tuning to find optimal prompts that lead to better model performance. In this paper, we explore the use of soft-prompt tuning on a sentiment classification task to quantify the biases of large language models (LLMs) such as Open Pre-trained Transformers (OPT) and the Galactica language model. Since these models are trained on real-world data that could be prone to bias toward certain groups of populations, it is important to identify these underlying issues. Using soft-prompts to evaluate bias gives us the extra advantage of avoiding the human-bias injection that can be caused by manually designed prompts. We check the model biases on different sensitive attributes using the group fairness (bias) and…
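To make the idea of a group-level bias check concrete, here is a minimal sketch in Python. It is not the paper's pipeline: the exact metrics are cut off in the excerpt above, so the choice of per-group accuracy and false positive rate, the function names `group_rates` and `gap`, and the toy records are all assumptions made for illustration. The sketch takes sentiment predictions tagged with a sensitive-attribute group and reports how far the groups differ on each rate.

```python
from collections import defaultdict


def group_rates(records):
    """Compute per-group accuracy and false positive rate.

    records: iterable of (group, true_label, predicted_label),
    with labels in {0, 1} (1 = positive sentiment). Hypothetical format.
    """
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "neg": 0, "fp": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        c["n"] += 1
        c["correct"] += int(y_true == y_pred)
        if y_true == 0:
            c["neg"] += 1
            c["fp"] += int(y_pred == 1)

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "accuracy": c["correct"] / c["n"],
            # Undefined when a group has no negative examples.
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
        }
    return rates


def gap(rates, metric):
    """Largest difference between any two groups on the given metric."""
    values = [r[metric] for r in rates.values()]
    return max(values) - min(values)


# Toy, made-up predictions for two illustrative groups.
toy = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 0, 1), ("group_a", 1, 1),
    ("group_b", 1, 1), ("group_b", 0, 0), ("group_b", 0, 0), ("group_b", 1, 0),
]
rates = group_rates(toy)
print(rates)
print("accuracy gap:", gap(rates, "accuracy"))
print("FPR gap:", gap(rates, "fpr"))
```

In this toy data both groups have the same accuracy, yet their false positive rates differ, which is why a group-wise comparison on more than one rate can surface disparities that a single aggregate score hides.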

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: Galactica