Merlin L48 Spectrogram Dataset
Aaron Sun, Subhransu Maji, Grant Van Horn

TL;DR
This paper introduces L48, a real-world, fine-grained bird sound dataset for single-positive multi-label (SPML) learning. It highlights the limitations of synthetically constructed SPML datasets and benchmarks existing methods on the new dataset.
Contribution
The paper presents the L48 dataset, a realistic, fine-grained bird sound dataset for SPML, and evaluates existing methods, revealing their weaknesses and the need for more challenging benchmarks.
Findings
Method performance differs significantly between synthetic and real-world datasets.
Existing SPML methods underperform on the L48 dataset.
Benchmarking shows the necessity for more robust SPML approaches.
Abstract
In the single-positive multi-label (SPML) setting, each image in a dataset is labeled with the presence of a single class, while the true presence of other classes remains unknown. The challenge is to narrow the performance gap between this partially-labeled setting and fully-supervised learning, which often requires a significant annotation budget. Prior SPML methods were developed and benchmarked on synthetic datasets created by randomly sampling single positive labels from fully-annotated datasets like Pascal VOC, COCO, NUS-WIDE, and CUB200. However, this synthetic approach does not reflect real-world scenarios and fails to capture the fine-grained complexities that can lead to difficult misclassifications. In this work, we introduce the L48 dataset, a fine-grained, real-world multi-label dataset derived from recordings of bird sounds. L48 provides a natural SPML setting with…
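The synthetic construction the abstract describes, randomly sampling a single positive label per item from a fully annotated dataset, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `sample_single_positive`, the use of -1 as an "unknown" marker, and the toy label matrix are all assumptions made for the example.

```python
import numpy as np

def sample_single_positive(full_labels, seed=0):
    """Keep one randomly chosen positive per row; mark everything else unknown.

    full_labels: (n_items, n_classes) binary array from a fully annotated dataset.
    Returns an array of the same shape with 1 for the retained positive and
    -1 (a hypothetical "unknown" marker) everywhere else.
    """
    rng = np.random.default_rng(seed)
    full_labels = np.asarray(full_labels)
    observed = np.full(full_labels.shape, -1, dtype=int)
    for i, row in enumerate(full_labels):
        positives = np.flatnonzero(row)
        if positives.size == 0:
            continue  # no positive to sample; the row stays fully unknown
        observed[i, rng.choice(positives)] = 1
    return observed

# Toy fully annotated label matrix (hypothetical data): 3 items, 4 classes.
full = np.array([[1, 0, 1, 0],
                 [0, 1, 0, 0],
                 [1, 1, 1, 0]])
spml = sample_single_positive(full)
```

Each row of `spml` holds exactly one observed positive, and every other entry is unknown rather than negative. Because the retained label is drawn uniformly from the true positives, this construction is the synthetic setting the abstract contrasts with the naturally occurring single-positive labels in L48.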
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning
