# Radial-Based Undersampling for Imbalanced Data Classification

**Authors:** Michał Koziarski

arXiv: 1906.00452 · 2021-04-20

## TL;DR

This paper introduces Radial-Based Undersampling (RBU), a method for handling class imbalance that is most useful on difficult datasets, such as those with outliers and small disjuncts.

## Contribution

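The paper takes the concept of mutual class potential, which guides the oversampling process in Radial-Based Oversampling (RBO), and applies it to undersampling. The resulting algorithm has a significantly reduced time complexity and proves useful on difficult datasets.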
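The idea of ranking majority observations by mutual class potential can be illustrated with a short example. This is a minimal sketch under stated assumptions, not the paper's implementation: it assumes a Gaussian radial basis function with a hypothetical spread parameter `gamma`, a binary problem, potentials computed once rather than updated after each removal, and a simple rule that drops the majority observations with the highest potential until the classes are balanced. The function names are illustrative.

```python
import numpy as np


def mutual_class_potential(points, majority, minority, gamma):
    """Potential at each row of `points`: majority RBF mass minus minority RBF mass.

    Assumed form (not taken from this summary): a sum of Gaussian kernels
    exp(-(distance / gamma)^2) centred on every observation of each class.
    """
    def rbf_sum(centers):
        # Pairwise squared Euclidean distances, shape (n_points, n_centers).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / gamma ** 2).sum(axis=1)

    return rbf_sum(majority) - rbf_sum(minority)


def radial_undersample(X, y, majority_label, gamma=1.0):
    """Drop the highest-potential majority rows until both classes have equal size."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    maj_idx = np.flatnonzero(y == majority_label)
    min_idx = np.flatnonzero(y != majority_label)

    n_remove = len(maj_idx) - len(min_idx)
    if n_remove <= 0:
        return X, y  # already balanced, nothing to remove

    # Potential is evaluated only at the majority observations.
    potential = mutual_class_potential(X[maj_idx], X[maj_idx], X[min_idx], gamma)

    # argsort is ascending, so the last n_remove entries have the highest potential.
    drop = maj_idx[np.argsort(potential)[-n_remove:]]
    keep = np.setdiff1d(np.arange(len(y)), drop)
    return X[keep], y[keep]
```

Calling `radial_undersample(X, y, majority_label=0)` returns a balanced copy of the data with the original row order preserved. Under these assumptions, majority observations in dense majority regions far from the minority class have the highest potential and are removed first.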

## Key findings

- The computational complexity analysis indicates a significantly reduced time complexity for the proposed undersampling algorithm.
- The experimental study indicates that the method is useful, especially on difficult datasets.
- Effective handling of outliers and small disjuncts.

## Abstract

Data imbalance remains one of the most widespread problems affecting contemporary machine learning. The negative effect data imbalance can have on traditional learning algorithms is most severe in combination with other dataset difficulty factors, such as small disjuncts, the presence of outliers and an insufficient number of training observations. These difficulty factors can also limit the applicability of some of the methods of dealing with data imbalance, in particular the neighborhood-based oversampling algorithms based on SMOTE. Radial-Based Oversampling (RBO) was previously proposed to mitigate some of the limitations of the neighborhood-based methods. In this paper we examine the possibility of utilizing the concept of mutual class potential, used to guide the oversampling process in RBO, in the undersampling procedure. The computational complexity analysis indicates a significantly reduced time complexity of the proposed Radial-Based Undersampling algorithm, and the results of the experimental study indicate its usefulness, especially on difficult datasets.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.00452/full.md

## Figures

30 figures with captions in the complete paper: https://tomesphere.com/paper/1906.00452/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/1906.00452/full.md

---
Source: https://tomesphere.com/paper/1906.00452