# Pushing One Pair of Labels Apart Each Time in Multi-Label Learning: From Single Positive to Full Labels

**Authors:** Xiang Li, Xinrui Wang, Songcan Chen

arXiv: 2302.14695 · 2024-06-11

## TL;DR

This paper introduces a loss function for multi-label learning that is more robust to noisy labels: it pushes only one pair of labels apart at a time, and it applies to both the single-positive and full-label settings.

## Contribution

The authors propose a new loss that mitigates noisy label effects in multi-label learning by focusing on one label pair at a time and leveraging label matrix high-rankness.

## Key findings

- The proposed loss outperforms existing Single Positive Multi-Label Learning (SPMLL) methods on the CUB dataset.
- High-rankness of the label matrix helps reduce the performance drop in SPMLL.
- On CUB, the loss alone is effective, without additional regularization or fine-tuned label correction.

## Abstract

In Multi-Label Learning (MLL), it is extremely challenging to accurately annotate every appearing object due to expensive costs and limited knowledge. When facing such a challenge, a more practical and cheaper alternative should be Single Positive Multi-Label Learning (SPMLL), where only one positive label needs to be provided per sample. Existing SPMLL methods usually assume unknown labels as negatives, which inevitably introduces false negatives as noisy labels. More seriously, Binary Cross Entropy (BCE) loss is often used for training, which is notoriously not robust to noisy labels. To mitigate this issue, we customize an objective function for SPMLL by pushing only one pair of labels apart each time to prevent the domination of negative labels, which is the main culprit of fitting noisy labels in SPMLL. To further combat such noisy labels, we explore the high-rankness of label matrix, which can also push apart different labels. By directly extending from SPMLL to MLL with full labels, a unified loss applicable to both settings is derived. Experiments on real datasets demonstrate that the proposed loss not only performs more robustly to noisy labels for SPMLL but also works well for full labels. Besides, we empirically discover that high-rankness can mitigate the dramatic performance drop in SPMLL. Most surprisingly, even without any regularization or fine-tuned label correction, only adopting our loss defeats state-of-the-art SPMLL methods on CUB, a dataset that severely lacks labels.
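The contrast the abstract draws can be made concrete with a small NumPy sketch. This is not the paper's loss, whose exact form is not given in this summary view. It is one plausible reading under stated assumptions: `assume_negative_bce` is the baseline that treats every unobserved label as a negative, `one_pair_loss` is a hypothetical stand-in that lets only the observed positive and the single highest-scoring other label contribute per sample, and `nuclear_norm` is a common convex surrogate for matrix rank, used here only to illustrate what "high-rankness" could be measured by. All function names are illustrative.

```python
import numpy as np


def assume_negative_bce(logits, pos_idx):
    """Baseline: BCE where every unobserved label is treated as a negative."""
    n, c = logits.shape
    targets = np.zeros((n, c))
    targets[np.arange(n), pos_idx] = 1.0
    # Numerically stable BCE with logits:
    # max(x, 0) - x * t + log(1 + exp(-|x|))
    per_label = (
        np.maximum(logits, 0.0)
        - logits * targets
        + np.log1p(np.exp(-np.abs(logits)))
    )
    # All c labels contribute, so the (c - 1) assumed negatives dominate.
    return per_label.sum(axis=1).mean()


def one_pair_loss(logits, pos_idx):
    """Hypothetical sketch (an assumption, not the paper's formula):
    push the observed positive above the single highest-scoring
    other label, so only one pair contributes per sample."""
    n, _ = logits.shape
    rows = np.arange(n)
    pos_scores = logits[rows, pos_idx]
    masked = logits.copy()
    masked[rows, pos_idx] = -np.inf  # exclude the positive itself
    hardest_other = masked.max(axis=1)
    # softplus(other - pos) = log(1 + exp(other - pos))
    return np.logaddexp(0.0, hardest_other - pos_scores).mean()


def nuclear_norm(matrix):
    """Sum of singular values, a common convex surrogate for rank."""
    return np.linalg.svd(matrix, compute_uv=False).sum()


if __name__ == "__main__":
    logits = np.array([[2.0, 0.0, -1.0],
                       [0.0, 1.0, 3.0]])
    pos_idx = np.array([0, 1])  # one observed positive per sample
    print("assume-negative BCE:", assume_negative_bce(logits, pos_idx))
    print("one-pair loss:      ", one_pair_loss(logits, pos_idx))
    print("nuclear norm:       ", nuclear_norm(logits))
```

In this sketch the baseline sums over every label, so a false negative among the unobserved labels is actively pushed down, while the one-pair variant touches at most one other label per sample. For a fixed Frobenius norm, a larger nuclear norm corresponds to a more evenly spread spectrum, which is why it is a natural, though assumed, proxy for the high-rankness the abstract mentions.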

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14695/full.md

## Figures

67 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14695/full.md

## References

53 references — full list in the complete paper: https://tomesphere.com/paper/2302.14695/full.md

---
Source: https://tomesphere.com/paper/2302.14695