The iNaturalist Sounds Dataset
Mustafa Chasmai, Alexander Shepard, Subhransu Maji, Grant Van Horn

TL;DR
The iNaturalist Sounds Dataset (iNatSounds) is a large, publicly available collection of 230,000 audio recordings from over 5,500 species, designed to advance machine learning and ecological research in bioacoustics.
Contribution
This paper introduces the first large-scale, diverse bioacoustic dataset with weak labels (a single species annotation per variable-length recording) and benchmarks multiple backbone architectures, demonstrating its usefulness for pretraining and ecological applications.
Findings
Models trained on iNatSounds improve species classification accuracy.
The dataset enables effective pretraining for downstream bioacoustic tasks.
Weakly labeled data can still be valuable for ecological sound analysis.
Abstract
We present the iNaturalist Sounds Dataset (iNatSounds), a collection of 230,000 audio files capturing sounds from over 5,500 species, contributed by more than 27,000 recordists worldwide. The dataset encompasses sounds from birds, mammals, insects, reptiles, and amphibians, with audio and species labels derived from observations submitted to iNaturalist, a global citizen science platform. Each recording in the dataset varies in length and includes a single species annotation. We benchmark multiple backbone architectures, comparing multiclass classification objectives with multilabel objectives. Despite weak labeling, we demonstrate that iNatSounds serves as a useful pretraining resource by benchmarking it on strongly labeled downstream evaluation datasets. The dataset is available as a single, freely accessible archive, promoting accessibility and research in this important domain. We…
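To make the contrast between multiclass and multilabel objectives concrete, here is a minimal sketch in plain Python. It is not the authors' training code: the function names, the three-class logits, and the choice of softmax cross-entropy versus mean per-class sigmoid binary cross-entropy are assumptions chosen for illustration. It only shows how one single-species annotation can be scored under either objective.

```python
import math

def multiclass_loss(logits, target):
    """Softmax cross-entropy: all classes compete for one label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def multilabel_loss(logits, targets):
    """Mean binary cross-entropy with an independent sigmoid per class."""
    total = 0.0
    for z, y in zip(logits, targets):
        # BCE on a logit equals softplus(z) - y*z, written in a stable form
        total += max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Hypothetical scores for three species; the recording is annotated with species 0.
logits = [2.0, 0.5, -1.0]
species = 0
one_hot = [1.0 if i == species else 0.0 for i in range(len(logits))]

print("multiclass:", multiclass_loss(logits, species))
print("multilabel:", multilabel_loss(logits, one_hot))
```

Under the multiclass objective, raising the score of one species necessarily lowers the probability of the others. Under the multilabel objective each species is scored independently, which can matter when a recording annotated with one species also contains other species in the background.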
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies
