Model Debiasing by Learnable Data Augmentation
Pietro Morerio, Ruggero Ragonesi, Vittorio Murino

TL;DR
This paper introduces a novel two-stage data augmentation method to improve neural network generalization on biased datasets, effectively reducing reliance on spurious correlations without requiring bias annotations.
Contribution
The work presents a bias-agnostic data augmentation pipeline that identifies biased samples and enhances model robustness, outperforming existing methods on synthetic and real datasets.
Findings
Achieves state-of-the-art accuracy on biased datasets
Improves generalization regardless of bias level
Robust performance on both synthetic and real-world data
Abstract
Deep Neural Networks are well known for efficiently fitting training data, yet they generalize poorly whenever some kind of bias dominates over the actual task labels, resulting in models learning "shortcuts". In essence, such models are often prone to learning spurious correlations between data and labels. In this work, we tackle the problem of learning from biased data in the very realistic unsupervised scenario, i.e., when the bias is unknown. This is a much harder task than the supervised case, where auxiliary, bias-related annotations can be exploited in the learning process. This paper proposes a novel 2-stage learning pipeline featuring a data augmentation strategy able to regularize the training. First, biased/unbiased samples are identified by training over-biased models. Second, such subdivision (typically noisy) is exploited within a data…
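As a rough illustration of the first stage only, the sketch below partitions training samples using the predictions of a deliberately over-biased model. The splitting rule (correctly classified means bias-aligned, misclassified means bias-conflicting) is an assumption made for this example, not a criterion stated in the abstract, and the function name `split_by_biased_model` is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_by_biased_model(
    predictions: Sequence[int],
    labels: Sequence[int],
) -> Tuple[List[int], List[int]]:
    """Partition sample indices using an over-biased model's predictions.

    Assumption (not taken from the paper): a sample that the biased model
    classifies correctly is treated as bias-aligned ("biased"), and a
    misclassified sample as bias-conflicting ("unbiased").
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")

    biased: List[int] = []
    unbiased: List[int] = []
    for idx, (pred, label) in enumerate(zip(predictions, labels)):
        if pred == label:
            biased.append(idx)    # the shortcut was enough to get this right
        else:
            unbiased.append(idx)  # the shortcut failed on this sample
    return biased, unbiased


if __name__ == "__main__":
    # Toy predictions from a hypothetical biased model on four samples.
    biased_idx, unbiased_idx = split_by_biased_model([0, 1, 1, 0], [0, 1, 0, 1])
    print("biased:", biased_idx, "unbiased:", unbiased_idx)
```

A split produced this way is noisy, as the abstract notes for the paper's own subdivision: some bias-aligned samples are misclassified and some bias-conflicting ones are classified correctly by chance.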
Taxonomy
Topics: Advanced Data Processing Techniques
