# A dataset of insect sounds from 459 species for bioacoustic machine learning

**Authors:** Marius Faiß, Burooj Ghani, Dan Stowell

PMC · DOI: 10.1038/s41597-026-07123-4 · 2026-03-27

## TL;DR

This paper introduces a large insect sound dataset to improve machine learning models for identifying insect species through audio.

## Contribution

The paper introduces InsectSet459, the first large-scale insect sound dataset suitable for developing deep-learning methods.

## Key findings

- The dataset contains 26,298 audio files from 459 insect species.
- Benchmarking with two state-of-the-art deep learning classifiers shows good performance, with significant room for improvement in acoustic insect classification.
- The dataset supports development of audio methods for variable frequencies and sample rates.
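
The headline figures imply some rough averages. The following is a minimal sketch of that arithmetic, using the file and species counts above together with the 226.6-hour total and the per-group species counts reported in the paper's abstract. The function and variable names are illustrative, and the results are simple means that say nothing about how recordings are actually distributed across species.

```python
# Totals as reported for InsectSet459.
N_FILES = 26_298
N_SPECIES = 459
TOTAL_HOURS = 226.6

# Per-group species counts reported in the abstract.
ORTHOPTERA_SPECIES = 310
CICADIDAE_SPECIES = 149


def dataset_averages(n_files, n_species, total_hours):
    """Return simple means; the real distribution may be heavily skewed."""
    return {
        "mean_seconds_per_file": total_hours * 3600 / n_files,
        "mean_files_per_species": n_files / n_species,
        "mean_minutes_per_species": total_hours * 60 / n_species,
    }


stats = dataset_averages(N_FILES, N_SPECIES, TOTAL_HOURS)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

On these totals the mean recording is about 31 seconds long, and each species has about 57 files and roughly half an hour of audio on average.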

## Abstract

Automatic recognition of insect sound could help us understand changing biodiversity trends around the world, but insect sounds are challenging to recognize even for deep learning, due to their broad frequency ranges and the limited amount of training data. We present a new dataset comprising 26,298 audio files (226.6 hours) from 459 species of Orthoptera (310 species) and Cicadidae (149 species). InsectSet459 is the first large-scale dataset of insect sound that is easily applicable for developing novel deep-learning methods. Its recordings were made with a variety of audio recorders using varying sample rates to capture the extremely broad range of frequencies that insects produce. We benchmark performance with two state-of-the-art deep learning classifiers, demonstrating good performance but also significant room for improvement in acoustic insect classification. This dataset can serve as a realistic test case for implementing insect monitoring workflows, and as a challenging basis for the development of audio representation methods that can handle highly variable frequencies and/or sample rates.
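
To illustrate why varying sample rates matter here: a recording can only represent frequencies up to half its sample rate (the Nyquist frequency), so resampling everything to one lower rate discards high-frequency content. The sketch below works through that arithmetic. The sample rates and the common target rate are hypothetical examples, not values taken from the dataset, and the function names are illustrative.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2


def retained_band_fraction(original_rate_hz, target_rate_hz):
    """Fraction of the original bandwidth kept after resampling.

    Upsampling adds no new information, so the fraction is capped at 1.0.
    """
    return min(1.0, nyquist_hz(target_rate_hz) / nyquist_hz(original_rate_hz))


# Illustrative sample rates only; not taken from the dataset.
EXAMPLE_RATES_HZ = [44_100, 96_000, 192_000]
COMMON_RATE_HZ = 44_100  # hypothetical fixed model input rate

for rate in EXAMPLE_RATES_HZ:
    kept = retained_band_fraction(rate, COMMON_RATE_HZ)
    print(
        f"{rate} Hz: Nyquist {nyquist_hz(rate):.0f} Hz, "
        f"{kept:.0%} of bandwidth kept at {COMMON_RATE_HZ} Hz"
    )
```

Under these assumed rates, a 96 kHz recording downsampled to 44.1 kHz keeps under half of its original bandwidth, which is one reason a fixed input rate is an awkward fit for sounds that extend into the ultrasonic range.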

## Linked entities

- **Species:** Orthoptera (taxon 6993), Cicadidae (taxon 7033)

## Full-text entities

- **Chemicals:** IS66 (-)
- **Species:** Homo sapiens (human, species) [taxon 9606], Caelifera (grasshoppers, groundhoppers & pygmy mole crickets, suborder) [taxon 7001]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13035933/full.md

---
Source: https://tomesphere.com/paper/PMC13035933