Context-Aware Counterfactual Data Augmentation for Gender Bias Mitigation in Language Models

Shweta Parihar; Liu Guangliang; Natalie Parde; Lu Cheng

arXiv:2602.09590·cs.CL·February 11, 2026

Open Access

TL;DR

This paper introduces Context-CDA, a context-augmented counterfactual data augmentation method that uses large LMs to make the debiasing corpus more diverse and contextually relevant, reducing gender bias in language models without degrading their language modeling capability.

Contribution

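The paper presents a context-augmented CDA approach that improves bias mitigation while maintaining language modeling performance. It targets two limitations of existing counterfactual augmentation: synthetic data that aligns poorly with real-world distributions, and overly simplistic counterfactuals that ignore the social context of the altered attribute.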

Findings

01. Effective gender bias mitigation demonstrated on benchmarks.

02. Maintains language modeling performance post-debiasing.

03. Provides insights into social biases through distribution analysis.

Abstract

A challenge in mitigating social bias in fine-tuned language models (LMs) is the potential reduction in language modeling capability, which can harm downstream performance. Counterfactual data augmentation (CDA), a widely used method for fine-tuning, highlights this issue by generating synthetic data that may align poorly with real-world distributions or creating overly simplistic counterfactuals that ignore the social context of altered sensitive attributes (e.g., gender) in the pretraining corpus. To address these limitations, we propose a simple yet effective context-augmented CDA method, Context-CDA, which uses large LMs to enhance the diversity and contextual relevance of the debiasing corpus. By minimizing discrepancies between the debiasing corpus and pretraining data through augmented context, this approach ensures better alignment, enhancing language modeling capability. We…
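
For readers unfamiliar with the baseline the abstract criticizes, the following is a minimal sketch of plain word-swap CDA. It is not the paper's Context-CDA method, which relies on large LMs to add context. The swap list `GENDER_PAIRS` and the function names `swap_gender_terms` and `augment` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny swap list (an assumption for illustration).
# Ambiguous pairs such as "his"/"her" are omitted, which already shows how
# simplistic a pure word-swap counterfactual can be.
GENDER_PAIRS = {
    "he": "she", "she": "he",
    "man": "woman", "woman": "man",
    "father": "mother", "mother": "father",
    "son": "daughter", "daughter": "son",
}

# Match any listed term as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, GENDER_PAIRS)) + r")\b",
    re.IGNORECASE,
)


def swap_gender_terms(sentence: str) -> str:
    """Return the sentence with each listed gendered term swapped."""

    def _replace(match: re.Match) -> str:
        word = match.group(0)
        swapped = GENDER_PAIRS[word.lower()]
        # Preserve the casing pattern of the original token.
        if word.isupper():
            return swapped.upper()
        if word[0].isupper():
            return swapped.capitalize()
        return swapped

    return _PATTERN.sub(_replace, sentence)


def augment(corpus: list[str]) -> list[str]:
    """Keep every original sentence and add its counterfactual if it differs."""
    augmented = []
    for sentence in corpus:
        augmented.append(sentence)
        counterfactual = swap_gender_terms(sentence)
        if counterfactual != sentence:
            augmented.append(counterfactual)
    return augmented


if __name__ == "__main__":
    for line in augment(["The father said he was tired.", "It rained."]):
        print(line)
```

The swap changes only the gendered tokens and leaves the surrounding context untouched, which is the kind of context-blind counterfactual the abstract describes as overly simplistic.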

Taxonomy

Topics: Topic Modeling · Ethics and Social Impacts of AI · Artificial Intelligence in Healthcare and Education