# Exploiting Unlabeled Data in CNNs by Self-supervised Learning to Rank

**Authors:** Xialei Liu, Joost van de Weijer, Andrew D. Bagdanov

arXiv: 1902.06285 · 2019-02-19

## TL;DR

This paper introduces a self-supervised ranking approach for CNNs that leverages unlabeled data to improve regression tasks like image quality assessment and crowd counting, reducing labeling effort and enhancing performance.

## Contribution

It proposes a novel ranking-based self-supervised learning framework with an efficient backpropagation method for Siamese networks, applicable to regression problems with unlabeled data.
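
To make the ranking idea concrete, here is a minimal sketch of a pairwise hinge ranking loss. It assumes each image in a batch is scored once and that pairs are then formed from those scores, which is one reading of how redundant multi-branch computation could be avoided. The function name, the `margin` parameter, and the integer `ranks` encoding are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def pairwise_hinge_ranking_loss(scores, ranks, margin=1.0):
    """Average hinge loss over all rankable pairs in one batch.

    scores: predicted values, one per image, from a single forward pass.
    ranks:  known relative ordering (hypothetical encoding); a larger rank
            means the image should receive the larger predicted value.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(scores)), 2):
        if ranks[i] == ranks[j]:
            continue  # ties carry no ordering information
        hi, lo = (i, j) if ranks[i] > ranks[j] else (j, i)
        # Penalize the pair unless the higher-ranked image scores
        # at least `margin` above the lower-ranked one.
        total += max(0.0, scores[lo] - scores[hi] + margin)
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0
```

With a batch of n images this evaluates every pair from n scores, rather than running a two-branch network once per pair.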

## Key findings

- Achieves state-of-the-art results on image quality assessment (IQA) and crowd counting.
- Demonstrates effective automatic generation of ranked data from unlabeled images.
- Shows that network uncertainty on the proxy task guides active learning, reducing labeling effort by up to 50%.
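
The active learning finding can be illustrated with a small sketch. It assumes that uncertainty on the proxy task is measured by counting how many pairs in an automatically ranked set the network orders incorrectly, and that the images with the most violations are sent for labeling first. Both helper names and the `budget` parameter are hypothetical, and the paper's exact uncertainty measure may differ.

```python
def count_rank_violations(scores):
    """Count wrongly ordered pairs.

    scores: predictions for one ranked set, listed in the expected
            order (largest target value first).
    """
    n = len(scores)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if scores[i] < scores[j]  # a later item outscored an earlier one
    )


def select_for_labeling(predictions, budget):
    """Pick the `budget` unlabeled images with the most ranking violations.

    predictions: dict mapping an image name to the scores of its ranked set.
    """
    by_uncertainty = sorted(
        predictions,
        key=lambda name: count_rank_violations(predictions[name]),
        reverse=True,
    )
    return by_uncertainty[:budget]
```

For example, `select_for_labeling({"a": [5, 4, 3], "b": [1, 2, 3], "c": [5, 3, 4]}, budget=2)` picks `"b"` and then `"c"`, since their ranked sets contain three and one violations respectively while `"a"` contains none.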

## Abstract

For many applications the collection of labeled data is expensive and laborious. Exploitation of unlabeled data during training is thus a long-pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of informativeness of unlabeled data. This can be used to drive an algorithm for active learning and we show that this reduces labeling effort by up to 50%.
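
As an illustration of how ranked image sets might be generated without labels, the sketch below builds nested, centered crop boxes. It rests on an assumption not spelled out in this summary: for crowd counting, a crop contained inside a larger crop cannot hold more people than the larger one, so the nesting order yields a ranking for free. The function name and the `steps` and `scale` parameters are hypothetical.

```python
def nested_crop_boxes(width, height, steps=4, scale=0.75):
    """Return centered crop boxes as (left, top, right, bottom), largest first.

    Each box lies inside the previous one, so the object count of the
    corresponding crops can only stay equal or decrease along the list.
    """
    cx, cy = width / 2.0, height / 2.0
    w, h = float(width), float(height)
    boxes = []
    for _ in range(steps):
        boxes.append((
            round(cx - w / 2),
            round(cy - h / 2),
            round(cx + w / 2),
            round(cy + h / 2),
        ))
        w *= scale  # shrink the next crop around the same center
        h *= scale
    return boxes
```

Cropping an unlabeled image with these boxes gives a sequence whose relative order is known even though no individual count is, which is the kind of supervision a ranking loss needs.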

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.06285/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1902.06285/full.md

## References

70 references — full list in the complete paper: https://tomesphere.com/paper/1902.06285/full.md

---
Source: https://tomesphere.com/paper/1902.06285