# UnbiasedNets: A Dataset Diversification Framework for Robustness Bias Alleviation in Neural Networks

**Authors:** Mahum Naseer, Bharath Srinivas Prabakaran, Osman Hasan, Muhammad Shafique

arXiv: 2302.12538 · 2023-03-14

## TL;DR

This paper introduces UnbiasedNets, a novel framework that uses clustering and noise tolerance to diversify training datasets, thereby reducing robustness bias caused by dataset imbalance in neural networks.

## Contribution

To the best of the authors' knowledge, UnbiasedNets is the first framework specifically designed to address robustness bias in neural networks. It diversifies the training dataset using K-means clustering and the network's noise tolerance.

## Key findings

- UnbiasedNets effectively balances datasets, reducing robustness bias in neural networks.
- The framework outperforms existing data balancing tools on real-world datasets.
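- Robustness bias drops significantly in some cases when comparing NNs trained on the diversified and original datasets; the evaluation covers both binary and multi-label classifiers.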

## Abstract

Performance of trained neural network (NN) models, in terms of testing accuracy, has improved remarkably over the past several years, especially with the advent of deep learning. However, even the most accurate NNs can be biased toward a specific output classification due to the inherent bias in the available training datasets, which may propagate to the real-world implementations. This paper deals with the robustness bias, i.e., the bias exhibited by the trained NN by having a significantly large robustness to noise for a certain output class, as compared to the remaining output classes. The bias is shown to result from imbalanced datasets, i.e., the datasets where all output classes are not equally represented. Towards this, we propose the UnbiasedNets framework, which leverages K-means clustering and the NN's noise tolerance to diversify the given training dataset, even from relatively smaller datasets. This generates balanced datasets and reduces the bias within the datasets themselves. To the best of our knowledge, this is the first framework catering to the robustness bias problem in NNs. We use real-world datasets to demonstrate the efficacy of the UnbiasedNets for data diversification, in case of both binary and multi-label classifiers. The results are compared to well-known tools aimed at generating balanced datasets, and illustrate how existing works have limited success while addressing the robustness bias. In contrast, UnbiasedNets provides a notable improvement over existing works, while even reducing the robustness bias significantly in some cases, as observed by comparing the NNs trained on the diversified and original datasets.
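To make the idea of cluster-guided, noise-bounded diversification more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an under-represented class can be grown by clustering its samples with K-means and then adding bounded noise to samples drawn evenly from each cluster. The function names `kmeans` and `diversify_minority`, the round-robin sampling, and the fixed `noise_bound` are all hypothetical; the abstract ties the noise to the NN's noise tolerance, for which a constant is used here as a stand-in.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the cluster label of each point."""
    rng = random.Random(seed)
    # Start from k distinct samples chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def diversify_minority(samples, n_target, k=3, noise_bound=0.05, seed=0):
    """Grow a minority class to n_target samples.

    noise_bound is an assumed constant standing in for a tolerance that
    would, in the paper's setting, come from the trained NN.
    """
    rng = random.Random(seed)
    samples = [[float(v) for v in s] for s in samples]
    labels = kmeans(samples, k, seed=seed)
    clusters = [
        [s for s, lab in zip(samples, labels) if lab == j] for j in range(k)
    ]
    clusters = [c for c in clusters if c]  # drop empty clusters

    out = [list(s) for s in samples]
    i = 0
    while len(out) < n_target:
        # Round-robin over clusters so every region of the class is covered.
        base = rng.choice(clusters[i % len(clusters)])
        out.append([v + rng.uniform(-noise_bound, noise_bound) for v in base])
        i += 1
    return out


if __name__ == "__main__":
    rng = random.Random(1)
    minority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(10)]
    balanced = diversify_minority(minority, n_target=40)
    print(len(balanced))
```

Drawing seed points per cluster rather than uniformly from the whole class is one way to keep sparse regions of the class from being drowned out by dense ones; whether this matches the paper's procedure in detail is not stated in the abstract.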

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.12538/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/2302.12538/full.md

## References

59 references — full list in the complete paper: https://tomesphere.com/paper/2302.12538/full.md

---
Source: https://tomesphere.com/paper/2302.12538