Self-Supervised Pretraining and Controlled Augmentation Improve Rare Wildlife Recognition in UAV Images
Xiaochen Zheng, Benjamin Kellenberger, Rui Gong, Irena Hajnsek, Devis Tuia

TL;DR
This paper introduces a self-supervised pretraining approach using contrastive learning and controlled augmentations to significantly improve rare wildlife recognition in UAV images, reducing annotation needs and boosting model recall.
Contribution
It demonstrates that combining MoCo, CLD, and geometric augmentations in self-supervised pretraining outperforms traditional ImageNet pretraining for wildlife detection in aerial imagery.
Findings
Outperforms ImageNet pretraining by a large margin
Maintains high accuracy with only 10% of training animals
Doubles recall at similar precision with minimal annotations
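To make the recall and precision claim concrete, here is a minimal sketch of how the two metrics are computed from detection counts. The counts are hypothetical values chosen for illustration and are not results from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for 100 animals in a test set (NOT from the paper).
baseline = precision_recall(tp=20, fp=20, fn=80)
improved = precision_recall(tp=40, fp=40, fn=60)

print("baseline precision/recall:", baseline)
print("improved precision/recall:", improved)
```

In this made-up case precision is identical for both models while recall doubles, which is the kind of trade-off the finding describes.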
Abstract
Automated animal censuses with aerial imagery are a vital ingredient towards wildlife conservation. Recent models are generally based on deep learning and thus require vast amounts of training data. Due to their scarcity and minuscule size, annotating animals in aerial imagery is a highly tedious process. In this project, we present a methodology to reduce the amount of required training data by resorting to self-supervised pretraining. In detail, we examine a combination of recent contrastive learning methodologies like Momentum Contrast (MoCo) and Cross-Level Instance-Group Discrimination (CLD) to condition our model on the aerial images without the requirement for labels. We show that a combination of MoCo, CLD, and geometric augmentations outperforms conventional models pre-trained on ImageNet by a large margin. Crucially, our method still yields favorable results even if we reduce…
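The contrastive objective behind MoCo can be illustrated with a minimal pure-Python sketch of the InfoNCE loss and the momentum update of the key encoder. This is not the authors' implementation: the function names, the toy vectors, and the flat parameter lists are assumptions made for illustration. The temperature of 0.07 and the momentum of 0.999 are commonly used MoCo defaults and are not confirmed by this summary. The CLD grouping branch is omitted.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE loss for one query: cross-entropy over similarity logits,
    with the positive key as the target class (index 0)."""
    q = l2_normalize(query)
    keys = [l2_normalize(positive_key)] + [l2_normalize(k) for k in negative_keys]
    logits = [dot(q, k) / temperature for k in keys]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[0]


def momentum_update(key_params, query_params, m=0.999):
    """Exponential moving average: the key encoder slowly tracks the query encoder."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


# Toy embeddings (hypothetical): two augmented views of one image patch
# should agree, while patches from other images act as negatives.
query = [1.0, 0.1]
positive = [0.9, 0.2]
negatives = [[-1.0, 0.3], [0.0, -1.0]]

print("loss, matching positive:  ", info_nce(query, positive, negatives))
print("loss, mismatched positive:", info_nce(query, negatives[0], [positive, negatives[1]]))
```

The loss is small when the query is closest to its positive key and grows when a negative is more similar. In MoCo the negatives come from a queue of keys encoded in earlier batches.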
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Wildlife Ecology and Conservation · Domain Adaptation and Few-Shot Learning
Methods: Contrastive Learning · Batch Normalization · InfoNCE · Momentum Contrast
