# Imbalance-XGBoost: Leveraging Weighted and Focal Losses for Binary Label-Imbalanced Classification with XGBoost

**Authors:** Chen Wang, Chengyuan Deng, Suzhen Wang

arXiv: 1908.01672 · 2021-08-24

## TL;DR

Imbalance-XGBoost is a Python package that adds weighted and focal losses to XGBoost for binary classification on label-imbalanced data. The authors report multiple state-of-the-art results on a Parkinson's disease classification dataset.

## Contribution

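The paper introduces what is, to the best of the authors' knowledge, the first integrated implementation of weighted and focal losses for XGBoost. It covers the algebraic derivation of the losses' first- and second-order derivatives and the practical usage of the package for imbalanced binary classification.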
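A weighted loss of this kind needs its first- and second-order derivatives with respect to the raw (pre-sigmoid) score. This summary does not reproduce the paper's formulas, so the sketch below assumes a common weighted cross-entropy form, `L = -(alpha * y * log(p) + (1 - y) * log(1 - p))` with `p = sigmoid(z)`. The names `alpha`, `weighted_ce_loss`, and `weighted_ce_grad_hess` are illustrative and are not the package's API.

```python
import math


def sigmoid(z):
    """Logistic function mapping a raw score z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def weighted_ce_loss(z, y, alpha):
    """Assumed weighted cross-entropy: positives (y == 1) are scaled by alpha."""
    p = sigmoid(z)
    return -(alpha * y * math.log(p) + (1 - y) * math.log(1.0 - p))


def weighted_ce_grad_hess(z, y, alpha):
    """First and second derivatives of the assumed loss with respect to z.

    Using dp/dz = p * (1 - p):
      y == 1: dL/dz = alpha * (p - 1),  d2L/dz2 = alpha * p * (1 - p)
      y == 0: dL/dz = p,                d2L/dz2 = p * (1 - p)
    """
    p = sigmoid(z)
    w = alpha if y == 1 else 1.0  # per-example weight
    grad = w * (p - y)
    hess = w * p * (1.0 - p)
    return grad, hess
```

For example, with `alpha = 2`, a positive example at `z = 0` gets gradient `-1.0` and Hessian `0.5`, twice the unweighted values. XGBoost's training API accepts a custom objective that returns per-example gradient and Hessian values, which is where expressions like these would plug in.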

## Key findings

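- Reported multiple state-of-the-art results on a Parkinson's disease classification dataset.
- Illustrated, with example code listings, how the package integrates into Python ML workflows.
- Argued that, given XGBoost's scalability, the package is suited to large-scale, label-imbalanced real-world tasks.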

## Abstract

The paper presents Imbalance-XGBoost, a Python package that combines the powerful XGBoost software with weighted and focal losses to tackle binary label-imbalanced classification tasks. Though small in size, the package is, to the best of the authors' knowledge, the first to provide an integrated implementation of the two losses on XGBoost, and it offers a general-purpose extension of XGBoost for label-imbalanced scenarios. The paper describes the design and usage of the package with exemplar code listings and illustrates how conveniently it integrates into Python-driven machine learning projects. Furthermore, because the first- and second-order derivatives of the loss functions are essential to the implementation, their algebraic derivation is discussed and can be regarded as a separate algorithmic contribution. The algorithms implemented in the package are evaluated empirically on a Parkinson's disease classification dataset, where multiple state-of-the-art results are observed. Given the scalable nature of XGBoost, the package has great potential for real-life binary classification tasks, which are usually large-scale and label-imbalanced.
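To make the role of the derivatives concrete for the focal loss, here is a minimal sketch. It assumes the standard binary focal loss `L = -(1 - p_t)^gamma * log(p_t)`, where `p_t` is the predicted probability of the true class, and differentiates it with respect to the raw score `z`. The function names and the exact loss form are assumptions for illustration; the paper's own derivation may be arranged differently.

```python
import math


def sigmoid(z):
    """Logistic function mapping a raw score z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def focal_loss(z, y, gamma):
    """Assumed binary focal loss on the raw score z for label y in {0, 1}."""
    p = sigmoid(z)
    pt = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -((1.0 - pt) ** gamma) * math.log(pt)


def focal_grad_hess(z, y, gamma):
    """First and second derivatives of the assumed focal loss w.r.t. z.

    Derived with d(pt)/dz = +pt * (1 - pt) for y == 1 and the negative of
    that for y == 0, so the gradient flips sign with the label while the
    second derivative does not.
    """
    p = sigmoid(z)
    pt = p if y == 1 else 1.0 - p
    sign = 1.0 if y == 1 else -1.0
    log_pt = math.log(pt)
    q = 1.0 - pt

    # dL/dz for y == 1, written in terms of pt
    g_t = gamma * pt * q ** gamma * log_pt - q ** (gamma + 1)
    # d2L/dz2, identical in form for both labels
    hess = (pt * q ** (gamma + 1) * (gamma * log_pt + 2.0 * gamma + 1.0)
            - gamma ** 2 * pt ** 2 * q ** gamma * log_pt)
    return sign * g_t, hess
```

With `gamma = 0` these expressions reduce to the plain cross-entropy values, gradient `p - y` and Hessian `p * (1 - p)`, which is a quick sanity check on the algebra.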

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.01672/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1908.01672/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1908.01672/full.md

---
Source: https://tomesphere.com/paper/1908.01672