# Semi-Supervised Learning with Scarce Annotations

**Authors:** Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi, Andrew Zisserman

arXiv: 1905.08845 · 2020-04-23

## TL;DR

This paper proposes a semi-supervised learning (SSL) method that combines transfer learning and self-supervision to train competitive multi-class classifiers from very little labeled data, reducing overfitting to the labeled points and avoiding the need to balance labeled and unlabeled losses.

## Contribution

The paper introduces an SSL algorithm that alternates between fitting the labeled data and fitting the unlabeled data, starting from a representation pre-trained without labels, which improves performance when annotations are scarce.

## Key findings

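- Competitive models can be trained with as few as 10 labeled points per class
- Bootstrapping features with self-supervised learning improves SSL on standard benchmarks
- The algorithm works increasingly well relative to other methods when refining from other tasks or datasets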

## Abstract

While semi-supervised learning (SSL) algorithms provide an efficient way to make use of both labelled and unlabelled data, they generally struggle when the number of annotated samples is very small. In this work, we consider the problem of SSL multi-class classification with very few labelled instances. We introduce two key ideas. The first is a simple but effective one: we leverage the power of transfer learning among different tasks and self-supervision to initialize a good representation of the data without making use of any label. The second idea is a new algorithm for SSL that can exploit such a pre-trained representation well. The algorithm works by alternating two phases, one fitting the labelled points and one fitting the unlabelled ones, with carefully controlled information flow between them. The benefits are greatly reducing overfitting of the labelled data and avoiding issues with balancing labelled and unlabelled losses during training. We show empirically that this method can successfully train competitive models with as few as 10 labelled data points per class. More generally, we show that the idea of bootstrapping features using self-supervised learning always improves SSL on standard benchmarks. We show that our algorithm works increasingly well compared to other methods when refining from other tasks or datasets.
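
The two-phase alternation described in the abstract can be illustrated with a toy sketch. The abstract does not say how information flows between the phases, so the hard pseudo-labels used here are an assumption. The linear softmax classifier (standing in for a pre-trained network), the synthetic Gaussian data, and every function name and hyperparameter are also hypothetical. This is not the authors' implementation.

```python
import math
import random


def predict(W, x):
    """Softmax class probabilities of a linear classifier; W[c] is the weight row of class c."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(W, x):
    """Index of the most probable class."""
    probs = predict(W, x)
    return probs.index(max(probs))


def fit(W, X, y, steps=50, lr=0.2):
    """Full-batch gradient descent on cross-entropy, warm-started from W."""
    for _ in range(steps):
        grad = [[0.0] * len(row) for row in W]
        for x, label in zip(X, y):
            for c, p in enumerate(predict(W, x)):
                err = p - (1.0 if c == label else 0.0)
                for j, xj in enumerate(x):
                    grad[c][j] += err * xj / len(X)
        W = [[w - lr * g for w, g in zip(row, g_row)] for row, g_row in zip(W, grad)]
    return W


def alternate(X_lab, y_lab, X_unlab, n_classes, rounds=3):
    """Alternate between a labelled-only phase and an unlabelled-only phase."""
    W = [[0.0] * len(X_lab[0]) for _ in range(n_classes)]
    for _ in range(rounds):
        # Phase 1: fit the labelled points only.
        W = fit(W, X_lab, y_lab)
        # Assumed information flow: hard pseudo-labels for the unlabelled points.
        pseudo = [classify(W, x) for x in X_unlab]
        # Phase 2: fit the unlabelled points only. The two losses are never
        # summed, so there is no weighting coefficient to tune.
        W = fit(W, X_unlab, pseudo)
    return W


def make_toy_data(seed=0, n_per_class=60, n_labelled=10):
    """Three Gaussian blobs in 2-D; the first n_labelled points of each class keep their label."""
    rng = random.Random(seed)
    centers = [(2.0, 0.0), (-2.0, 0.0), (0.0, 3.0)]
    X_lab, y_lab, X_unlab, y_unlab = [], [], [], []
    for label, (cx, cy) in enumerate(centers):
        for i in range(n_per_class):
            x = [rng.gauss(cx, 0.5), rng.gauss(cy, 0.5), 1.0]  # last entry is a bias feature
            if i < n_labelled:
                X_lab.append(x)
                y_lab.append(label)
            else:
                X_unlab.append(x)
                y_unlab.append(label)
    return X_lab, y_lab, X_unlab, y_unlab


if __name__ == "__main__":
    X_lab, y_lab, X_unlab, y_unlab = make_toy_data()
    W = alternate(X_lab, y_lab, X_unlab, n_classes=3)
    correct = sum(classify(W, x) == t for x, t in zip(X_unlab, y_unlab))
    print(f"accuracy on unlabelled points: {correct / len(y_unlab):.3f}")
```

Keeping the two phases separate, rather than minimising one weighted sum of a labelled and an unlabelled loss, is what removes the balancing coefficient in this sketch. The held-back labels in `y_unlab` are used only to measure accuracy.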

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.08845/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1905.08845/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/1905.08845/full.md

---
Source: https://tomesphere.com/paper/1905.08845