DASH: Visual Analytics for Debiasing Image Classification via User-Driven Synthetic Data Augmentation

Bum Chul Kwon; Jungsoo Lee; Chaeyeon Chung; Nyoungwoo Lee; Ho-Jin Choi; Jaegul Choo

arXiv:2209.06357·cs.HC·September 15, 2022

TL;DR

DASH is a visual analytics system that enables human-in-the-loop debiasing of image classifiers by identifying bias factors and generating synthetic data to improve model fairness and accuracy.

Contribution

The paper introduces DASH, a novel visual analytics tool that supports human-guided bias mitigation in image classification through bias factor identification and synthetic data augmentation.

Findings

01. DASH effectively helps users identify bias factors in models.

02. Synthetic data augmentation improves classification accuracy.

03. User studies show DASH's usefulness in bias mitigation.

Abstract

Image classification models often learn to predict a class based on irrelevant co-occurrences between input features and an output class in training data. We call the unwanted correlations "data biases," and the visual features causing data biases "bias factors." It is challenging to identify and mitigate biases automatically without human intervention. Therefore, we conducted a design study to find a human-in-the-loop solution. First, we identified user tasks that capture the bias mitigation process for image classification models with three experts. Then, to support the tasks, we developed a visual analytics system called DASH that allows users to visually identify bias factors, to iteratively generate synthetic images using a state-of-the-art image-to-image translation model, and to supervise the model training process for improving the classification accuracy. Our quantitative…
