Assessing Dataset Bias in Computer Vision

Athiya Deviyani

arXiv:2205.01811·cs.CV·May 5, 2022

TL;DR

This paper investigates how various data augmentation techniques can reduce dataset bias in facial attribute classification, improving model fairness and accuracy across diverse datasets.

Contribution

It compares multiple augmentation methods, identifies StarGAN as most effective, and demonstrates bias mitigation and improved performance over existing models.

Findings

01. StarGAN augmentation yields best overall performance.

02. Geometric transformations offer similar accuracy with faster training.

03. Models trained on augmented data show more uniform class performance.
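
This summary does not say how uniformity of class performance was measured. As a minimal sketch, assuming per-class accuracy is the quantity of interest, one simple check is the gap between the best and worst class. The function names and toy labels below are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return a dict mapping each true class to the fraction predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


def accuracy_gap(class_accuracy):
    """Difference between the best and worst class; smaller means more uniform."""
    return max(class_accuracy.values()) - min(class_accuracy.values())


# Hypothetical toy labels, purely for illustration.
y_true = ["a", "a", "a", "a", "b", "b"]
y_pred = ["a", "a", "a", "b", "b", "a"]

acc = per_class_accuracy(y_true, y_pred)
gap = accuracy_gap(acc)
print(acc, gap)
```

A smaller gap indicates that the classifier treats the classes more evenly, independent of its overall accuracy.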

Abstract

A biased dataset is a dataset that generally has attributes with an uneven class distribution. These biases tend to propagate to the models trained on them, often leading to poor performance on the minority class. In this project, we explore the extent to which various data augmentation methods alleviate intrinsic biases within the dataset. We apply several augmentation techniques to a sample of the UTKFace dataset: undersampling, geometric transformations, variational autoencoders (VAEs), and generative adversarial networks (GANs). We then train a classifier on each of the augmented datasets and evaluate its performance on the native test set and on external facial recognition datasets. We also compare the performance of these classifiers to the state-of-the-art attribute classifier trained on the FairFace dataset. Through experimentation, we were able to…
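
Of the techniques listed in the abstract, undersampling is the simplest to illustrate. The following is a minimal sketch, assuming random undersampling down to the size of the rarest class; the paper's exact procedure is not given in this summary, and the function name, file names, and labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(samples, labels, seed=0):
    """Randomly drop samples so every class keeps as many items as the rarest class."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is cut down to the size of the smallest one.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            balanced.append((sample, label))

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: made-up file names with an uneven class distribution.
samples = [f"img_{i}.jpg" for i in range(10)]
labels = ["a"] * 6 + ["b"] * 3 + ["c"] * 1

balanced = undersample(samples, labels)
counts = Counter(label for _, label in balanced)
print(counts)
```

Undersampling balances the classes at the cost of discarding majority-class data, which is one reason to compare it against methods that add samples instead, such as geometric transformations or generative models.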
