Positive-Unlabelled Active Learning to Curate a Dataset for Orca Resident Interpretation
Bret Nestor, Bohan Yao, Jasmine Moore, Jasper Kanes

TL;DR
This paper introduces a large, curated dataset of Southern Resident Killer Whale (SRKW) acoustic data, built with a positive-unlabelled active learning approach and transformer classifiers to support marine mammal monitoring and conservation.
Contribution
The work presents the largest SRKW acoustic dataset to date, created through a novel weakly-supervised, positive-unlabelled active learning method. The resulting transformer-based classifiers outperform state-of-the-art classifiers on 3 of 4 expert-annotated datasets.
Findings
Transformer-based presence/absence classifiers outperform state-of-the-art classifiers on 3 of 4 expert-annotated datasets in accuracy and energy efficiency.
The dataset includes over 900 hours of SRKW data, surpassing previous collections.
Curated labels and data are openly available for research and conservation.
Abstract
This work presents the largest curation of Southern Resident Killer Whale (SRKW) acoustic data to date, also containing other marine mammals in their environment. We systematically search all available public archival hydrophone data within the SRKW habitat (over 30 years of audio data). The search consists of a weakly-supervised, positive-unlabelled, active learning strategy to identify all instances of marine mammals. The resulting transformer-based presence or absence classifiers outperform state-of-the-art classifiers on 3 of 4 expert-annotated datasets in terms of accuracy and energy efficiency. The fleet of WHISPER detection models ranges from 0.58 (0.48-0.67) AUROC with WHISPER-tiny to 0.77 (0.63-0.93) with WHISPER-large-v3. Our multiclass species classifier obtains a top-1 accuracy of 53.2% (11 train classes, 4 test classes) and our ecotype classifier obtains a top-1 accuracy of…
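To make the search strategy in the abstract more concrete, here is a minimal sketch of a positive-unlabelled active learning loop. It is not the authors' pipeline. It assumes a small pure-Python logistic regression as a stand-in for the transformer classifier, synthetic 2-D features in place of audio embeddings, and a hypothetical `oracle` function in place of a human annotator. All function and variable names are illustrative. Each round trains a classifier that treats unlabelled clips as provisional negatives, ranks the unlabelled pool by score, and sends the top candidates for review.

```python
import math
import random


def sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def score(x, w, b):
    # Predicted probability that sample x is a positive.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, epochs=200, lr=0.5):
    # Batch gradient descent on the logistic loss (stand-in classifier).
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = score(xi, w, b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def pu_active_learning(X, oracle, seeds, rounds=5, k=10):
    # positives: confirmed positives; rejected: candidates the oracle refused.
    positives, rejected = set(seeds), set()
    for _ in range(rounds):
        # PU assumption: everything not confirmed positive is labelled 0.
        y = [1 if i in positives else 0 for i in range(len(X))]
        w, b = train_logistic(X, y)
        pool = [i for i in range(len(X))
                if i not in positives and i not in rejected]
        # Query the highest-scoring unlabelled samples first.
        pool.sort(key=lambda i: score(X[i], w, b), reverse=True)
        for i in pool[:k]:
            if oracle(i):
                positives.add(i)
            else:
                rejected.add(i)
    return positives, rejected


# Synthetic demo: 40 true positives near (2, 2), 160 negatives near (0, 0).
rng = random.Random(0)
truth = [i < 40 for i in range(200)]
raw = [[rng.gauss(2.0 if t else 0.0, 0.5) for _ in range(2)] for t in truth]
means = [sum(col) / len(raw) for col in zip(*raw)]
X = [[v - m for v, m in zip(row, means)] for row in raw]  # centre features
seeds = [0, 1, 2, 3, 4]  # the only positives known at the start
found, rejected = pu_active_learning(X, lambda i: truth[i], seeds)
print(len(found), "confirmed positives;", len(rejected), "rejected candidates")
```

Querying the top-scoring candidates is only one possible acquisition rule. An uncertainty-based rule (scores nearest 0.5) would be a drop-in change to the sort key, and the source excerpt does not say which rule the authors use.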
