TL;DR
This paper demonstrates that self-supervised pre-training on in-domain benthic imagery improves hierarchical multi-label classification performance, especially with missing annotations, and establishes a benchmark for underwater image annotation.
Contribution
The paper introduces a method for hierarchical multi-label (HML) classification that tolerates missing annotation levels, built on self-supervised pre-training with benthic imagery, and shows that it outperforms ImageNet pre-training.
Findings
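On the smaller one-hot label datasets typical of local or regional benthic projects, self-supervised in-domain pre-training outperforms ImageNet pre-training.
In the HML setting, models reach deeper and more precise classifications with in-domain pre-training.
HML training remains feasible when annotations are missing at multiple levels of the hierarchy, as happens with heterogeneous data collected by research groups using differing protocols.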
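One common way to train with missing annotation levels is to mask unknown labels out of the loss. The sketch below illustrates that general idea and is not the paper's implementation: the toy taxonomy (`NODES`, `PARENT`), the helper names, and the assumption that each image carries a single label at some depth of the hierarchy are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical toy taxonomy: index -> parent index (-1 marks a root).
NODES = ["biota", "substrate", "coral", "sponge", "hard_coral", "soft_coral"]
PARENT = [-1, -1, 0, 0, 2, 2]


def ancestors(node):
    """Return the node followed by all of its ancestors up to the root."""
    path = []
    while node != -1:
        path.append(node)
        node = PARENT[node]
    return path


def encode(node):
    """Build (target, mask) for an image annotated only down to `node`.

    Assumption: the annotated node and its ancestors are positives,
    strict descendants of the annotated node are unknown (masked out),
    and every other node is treated as a known negative.
    """
    n = len(NODES)
    target = np.zeros(n)
    mask = np.ones(n)
    target[ancestors(node)] = 1.0
    for i in range(n):
        if i != node and node in ancestors(i):
            mask[i] = 0.0  # finer level was not annotated
    return target, mask


def masked_bce(logits, target, mask):
    """Mean binary cross-entropy with logits over known entries only."""
    # Numerically stable form: max(x, 0) - x * t + log(1 + exp(-|x|))
    per_label = (
        np.maximum(logits, 0)
        - logits * target
        + np.log1p(np.exp(-np.abs(logits)))
    )
    return float((per_label * mask).sum() / max(mask.sum(), 1.0))


# An image labelled only as "coral": hard_coral / soft_coral stay unknown.
target, mask = encode(NODES.index("coral"))
loss = masked_bce(np.zeros(len(NODES)), target, mask)
```

Because masked entries contribute nothing to the loss, a coarsely labelled image still supervises the upper levels of the hierarchy without penalising any prediction at the finer levels it leaves unannotated.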
Abstract
In this work, we apply state-of-the-art self-supervised learning techniques to a large dataset of seafloor imagery, BenthicNet, and study their performance for a complex hierarchical multi-label (HML) classification downstream task. In particular, we demonstrate the capacity to conduct HML training in scenarios where there exist multiple levels of missing annotation information, an important scenario for handling heterogeneous real-world data collected by multiple research groups with differing data collection protocols. We find that, when using smaller one-hot image label datasets typical of local or regional scale benthic science projects, models pre-trained with self-supervision on a larger collection of in-domain benthic data outperform models pre-trained on ImageNet. In the HML setting, we find the model can attain a deeper and more precise classification if it is…
