TL;DR
FairFlow is an automated, model-based method for generating high-quality counterfactual data to reduce societal biases in NLP models, overcoming limitations of previous dictionary-based approaches.
Contribution
FairFlow introduces an automated approach to generating parallel data for counterfactual data augmentation, reducing reliance on manually collected parallel data and improving quality over dictionary-based methods.
Findings
Outperforms dictionary-based substitution in quality and context relevance.
Reduces need for manual parallel data collection.
Effectively mitigates societal biases in NLP models.
Abstract
Despite the evolution of language models, they continue to portray harmful societal biases and stereotypes inadvertently learned from training data. These inherent biases often result in detrimental effects in various applications. Counterfactual Data Augmentation (CDA), which seeks to balance demographic attributes in training data, has been a widely adopted approach to mitigate bias in natural language processing. However, many existing CDA approaches rely on word substitution techniques using manually compiled word-pair dictionaries. These techniques often lead to out-of-context substitutions, resulting in potential quality issues. The advancement of model-based techniques, on the other hand, has been challenged by the need for parallel training data. Works in this area resort to manually generated parallel data that are expensive to collect and are consequently limited in scale.…
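To make the dictionary-based baseline described in the abstract concrete, here is a minimal sketch of word-pair substitution. It is an illustration only and is not FairFlow's method; the `WORD_PAIRS` dictionary and the `swap_words` helper are hypothetical names invented for this example, and real word-pair dictionaries are far larger.

```python
import re

# Hypothetical, tiny word-pair dictionary (an assumption for illustration).
WORD_PAIRS = {
    "he": "she", "she": "he",
    "his": "her", "her": "his",
    "him": "her",
    "man": "woman", "woman": "man",
}

def swap_words(sentence, pairs=WORD_PAIRS):
    """Replace each word found in `pairs` with its counterpart, ignoring context."""
    def repl(match):
        word = match.group(0)
        target = pairs.get(word.lower())
        if target is None:
            return word  # word is not in the dictionary: leave it unchanged
        # Preserve the capitalization of the first letter.
        return target.capitalize() if word[0].isupper() else target
    # Match whole alphabetic runs so that "man" inside "many" is not touched.
    return re.sub(r"[A-Za-z]+", repl, sentence)

print(swap_words("He thanked his sister."))  # She thanked her sister.
print(swap_words("I saw her yesterday."))    # I saw his yesterday.
```

The second call shows the out-of-context substitution the abstract refers to: "her" is an object pronoun in that sentence and should become "him", but a context-free lookup can only map it to one fixed counterpart.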
