Downstream bias mitigation is all you need

Arkadeep Baksi; Rahul Singh; Tarun Joshi

arXiv:2408.00612 · cs.CL · August 29, 2024

TL;DR

This paper investigates bias in large language models and finds that mitigating bias in the domain-specific data used for fine-tuning is more effective than intervening on the pre-trained model.

Contribution

It demonstrates that bias mitigation is more effective when applied to fine-tuning datasets than when applied to pre-trained models, emphasizing the importance of domain-specific data.

Findings

01

Controlled interventions on pre-trained LLMs, applied before fine-tuning, have minimal effect on the biases of the resulting classifiers.

02

Bias mitigation at the fine-tuning stage has a larger impact.

03

Small changes in fine-tuning data co-occurrence rates significantly affect bias.
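
To make the third finding concrete, here is a minimal sketch of one way a co-occurrence rate could be measured in a fine-tuning dataset. This summary does not state the paper's exact definition, so the one used here (the fraction of examples mentioning a term that carry a given label) is an assumption, and the function name `cooccurrence_rate` and the toy data are hypothetical.

```python
import re

def cooccurrence_rate(examples, term, label):
    """Fraction of examples mentioning `term` that carry `label`.

    Assumed definition for illustration; not taken from the paper.
    """
    # Whole-word, case-insensitive match, so "he" does not match "she" or "the".
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    labels = [y for text, y in examples if pattern.search(text)]
    if not labels:
        return float("nan")  # term never appears in the dataset
    return sum(1 for y in labels if y == label) / len(labels)

# Hypothetical toy fine-tuning set of (text, label) pairs.
toy_data = [
    ("he is a brilliant engineer", "positive"),
    ("he missed the deadline", "negative"),
    ("he fixed the bug quickly", "positive"),
    ("she is a brilliant engineer", "positive"),
    ("she missed the deadline", "negative"),
    ("she was late again", "negative"),
]

rate_he = cooccurrence_rate(toy_data, "he", "positive")
rate_she = cooccurrence_rate(toy_data, "she", "positive")
gap = rate_he - rate_she  # a nonzero gap means the term and label are correlated

print(f"he: {rate_he:.2f}, she: {rate_she:.2f}, gap: {gap:.2f}")
```

In this toy set, relabeling or replacing a single example changes the gap noticeably, which gives a rough sense of how small shifts in the fine-tuning data can change what a classifier sees.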

Abstract

The advent of transformer-based architectures and large language models (LLMs) has significantly advanced the performance of natural language processing (NLP) models. Because these LLMs are trained on huge corpora of data from the web and other sources, there is major concern that harmful prejudices may be transferred from the data. In many applications, these pre-trained LLMs are fine-tuned on task-specific datasets, which can further contribute to biases. This paper studies the extent of biases absorbed by LLMs during pre-training as well as task-specific behaviour after fine-tuning. We found that controlled interventions on pre-trained LLMs, prior to fine-tuning, have minimal effect on lowering biases in classifiers. However, the biases present in domain-specific datasets play a much bigger role, and hence mitigating them at this stage has a bigger impact.…
