# Stacked Convolutional and Recurrent Neural Networks for Bird Audio Detection

**Authors:** Sharath Adavanne, Konstantinos Drossos, Emre Çakır, Tuomas Virtanen

arXiv: 1706.02047 · 2017-06-08

## TL;DR

This paper detects bird calls in audio segments using stacked convolutional and recurrent neural networks. Data augmentation by blocks mixing and domain adaptation by test mixing are used to make the method robust to unseen data, on which it reaches an AUC of 88.1%.

## Contribution

It introduces test mixing, a novel domain adaptation method, and studies how two kinds of acoustic features (dominant frequency and log mel-band energy) and their combinations contribute to bird audio detection.

## Key findings

- Best AUC of 95.5% on five cross-validations of the development data
- AUC of 88.1% on the unseen evaluation data
- Evaluated blocks-mixing data augmentation, test-mixing domain adaptation, and acoustic feature combinations for robustness to unseen data
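
The AUC figures above are areas under the ROC curve, which can be read as the probability that a randomly chosen positive clip (bird present) receives a higher score than a randomly chosen negative clip. The following is a minimal sketch of that computation in plain Python, not the authors' evaluation code. The names `roc_auc` and `mean_auc_over_folds`, the toy labels and scores, and the choice to average AUC across folds are all assumptions made for illustration; this summary does not state how the five cross-validation results were aggregated.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 ground-truth values (1 = bird present).
    scores: iterable of classifier outputs, higher = more likely a bird.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


def mean_auc_over_folds(folds):
    """Average AUC over (labels, scores) pairs, one pair per fold.

    Averaging is an assumption here, not something the summary specifies.
    """
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return sum(aucs) / len(aucs)


# Hypothetical toy data, not taken from the paper.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
toy_auc = roc_auc(labels, scores)
```

On the toy data, eight of the nine positive/negative pairs are ranked correctly, so `toy_auc` is 8/9. The pairwise loop is quadratic in the number of clips; for large evaluation sets a rank-based implementation such as scikit-learn's `roc_auc_score` would be the more practical choice.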

## Abstract

This paper studies the detection of bird calls in audio segments using stacked convolutional and recurrent neural networks. Data augmentation by blocks mixing and domain adaptation using a novel method of test mixing are proposed and evaluated in regard to making the method robust to unseen data. The contributions of two kinds of acoustic features (dominant frequency and log mel-band energy) and their combinations are studied in the context of bird audio detection. Our best achieved AUC measure on five cross-validations of the development data is 95.5% and 88.1% on the unseen evaluation data.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.02047/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1706.02047/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1706.02047/full.md

---
Source: https://tomesphere.com/paper/1706.02047