# HasLoss: a novel Hassanat distance-based loss functions for binary classification

**Authors:** Ahmad S. Tarawneh

PMC · DOI: 10.3389/frai.2025.1690830 · Frontiers in Artificial Intelligence · 2026-02-10

## TL;DR

This paper introduces Hassanat distance-based loss functions for binary classification, showing they offer robustness and convergence guarantees with competitive performance.

## Contribution

A novel theoretical framework for distance-based loss functions using Hassanat distance with bounded gradients and empirical validation.

## Key findings

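- Hassanat losses have bounded gradients with finite Lipschitz constants, which the authors link to convergence guarantees and robustness to outliers.
- The proposed variants matched or slightly outperformed BCE, Focal Loss, MSE, and L1 Loss on several datasets in calibration, convergence speed (in epochs), precision, recall, F1-score, and AUC.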
- Some variants showed larger practical effect sizes than popular loss functions like BCE.

## Abstract

Loss functions play a critical role in machine learning, particularly in training neural networks for classification tasks. In this work, we establish a theoretical framework for distance-based loss functions by adapting the Hassanat distance for binary classification.
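As a rough illustration of the quantity being adapted, the sketch below implements the per-dimension Hassanat distance as it is commonly defined (Hassanat, 2014) and averages it over predictions and labels. The function names `hassanat_distance_1d` and `hassanat_loss` are hypothetical, and the mean-distance loss is an assumed baseline for illustration only; this summary does not specify the six variants formulated in the paper.

```python
def hassanat_distance_1d(a: float, b: float) -> float:
    """Per-dimension Hassanat distance; the result lies in [0, 1)."""
    lo, hi = min(a, b), max(a, b)
    if lo >= 0:
        return 1.0 - (1.0 + lo) / (1.0 + hi)
    # Negative values: shift both terms by |lo| so the ratio stays in (0, 1].
    shift = abs(lo)
    return 1.0 - (1.0 + lo + shift) / (1.0 + hi + shift)


def hassanat_loss(probs, labels):
    """Assumed baseline (not a variant from the paper): mean distance
    between predicted probabilities in [0, 1] and labels in {0, 1}."""
    total = sum(hassanat_distance_1d(p, y) for p, y in zip(probs, labels))
    return total / len(probs)


if __name__ == "__main__":
    # A perfect prediction costs 0; a maximally wrong one costs 0.5.
    print(hassanat_loss([1.0, 0.0], [1, 0]))
    print(hassanat_loss([0.0, 1.0], [1, 0]))
```

Because predictions and labels are both non-negative in this setting, only the first branch is exercised, and the per-sample loss never exceeds 0.5.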

Through gradient analysis, we prove that Hassanat losses exhibit bounded gradients with finite Lipschitz constants, providing convergence guarantees and robustness to outliers. We formulate six variants with different error sensitivities and validate these theoretical properties empirically. Their effectiveness is evaluated on synthetic datasets and nine real-world datasets, ranging from a few hundred to nearly 48,000 samples, under controlled experimental conditions. A comprehensive comparison is conducted against widely used loss functions, including Binary Cross-Entropy (BCE), Focal Loss, Mean Squared Error (MSE), and L1 Loss.
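The bounded-gradient claim can be made concrete for the plain Hassanat distance between a probability p in [0, 1] and a label y in {0, 1}. The sketch below is an assumption-laden illustration, not the paper's derivation: it uses the unmodified distance rather than any of the six variants, and the helper names are hypothetical. For y = 1 the distance reduces to (1 - p) / 2, and for y = 0 it reduces to p / (1 + p), so the derivative with respect to p never exceeds 1 in magnitude, whereas the BCE derivative grows without bound as a prediction approaches the wrong extreme.

```python
def hassanat_prob_distance(p: float, y: int) -> float:
    """Hassanat distance for non-negative inputs: p in [0, 1], y in {0, 1}."""
    lo, hi = min(p, y), max(p, y)
    return 1.0 - (1.0 + lo) / (1.0 + hi)


def hassanat_grad(p: float, y: int) -> float:
    """Analytic d/dp of hassanat_prob_distance."""
    if y == 1:
        return -0.5                  # distance is (1 - p) / 2
    return 1.0 / (1.0 + p) ** 2      # distance is p / (1 + p)


def bce_grad(p: float, y: int) -> float:
    """d/dp of binary cross-entropy; undefined at p = 0 (y = 1) or p = 1 (y = 0)."""
    return -1.0 / p if y == 1 else 1.0 / (1.0 - p)


if __name__ == "__main__":
    # A confidently wrong prediction for a positive sample.
    p, y = 1e-6, 1
    print("Hassanat gradient:", hassanat_grad(p, y))
    print("BCE gradient:     ", bce_grad(p, y))
```

Under these assumptions the gradient magnitude stays in [0.25, 1] for negative samples and equals 0.5 for positive ones, which is one way to see why a single mislabeled or outlying sample cannot dominate an update.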

Results show that the proposed Hassanat-based losses achieve competitive performance across evaluation metrics, with comparable or slightly improved results in calibration, convergence speed (in terms of epochs), precision, recall, F1-score, and AUC on several datasets, while exhibiting notable robustness to outliers and noise. The estimated Floating Point Operations (FLOPs) show that the wall-clock time difference is due to an implementation gap rather than an algorithmic one. Importantly, Cohen's d effect size and confidence interval analyses show that some of the proposed variants yield a larger practical effect size than popular loss functions such as BCE.
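For readers unfamiliar with the effect-size measure mentioned above, the following is a minimal sketch of Cohen's d using the standard pooled-standard-deviation formula. Which groups the paper compares, and whether it applies a small-sample correction, is not stated in this summary, so the function name `cohens_d` and the sample scores are purely illustrative.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


if __name__ == "__main__":
    # Made-up scores for two hypothetical loss functions over repeated runs.
    print(cohens_d([2, 4, 6], [1, 3, 5]))
```

By the usual rule of thumb, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects respectively.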

This work establishes both theoretical foundations and empirical validation for distance-based loss functions. The bounded gradient framework with finite Lipschitz constants provides principled optimization guarantees while explaining observed robustness and convergence behavior. This foundation enables systematic development of robust loss functions tailored to specific application requirements.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12929465/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12929465/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/PMC12929465/full.md

---
Source: https://tomesphere.com/paper/PMC12929465