Automatic recognition of element classes and boundaries in the birdsong with variable sequences
Takuya Koumura, Kazuo Okanoya

TL;DR
This paper presents a hybrid deep neural network and hidden Markov model approach for automatic recognition of birdsong, focusing on note classification, boundary detection, and sequence modeling, with a new evaluation measure for accuracy.
Contribution
It introduces a novel hybrid model combining neural networks and HMMs tailored for birdsong recognition, emphasizing boundary detection and sequence properties.
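To make the hybrid idea concrete, here is a minimal sketch of how frame-wise class scores from a neural network could be decoded by an HMM into labelled notes with onset and offset frames. It is an illustration under stated assumptions, not the authors' implementation: the posteriors are hard-coded rather than produced by a trained network, they are used directly as emission scores, the transition probabilities are invented, and the names `viterbi`, `segments`, `posteriors`, and `trans` are hypothetical.

```python
import math

def viterbi(log_emis, log_trans, log_init):
    """Return the most likely state path for per-frame log emission scores."""
    n_states = len(log_init)
    score = [log_init[s] + log_emis[0][s] for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for frame in log_emis[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def segments(path, silence=0):
    """Collapse a frame-wise path into (label, onset, offset); offset is exclusive."""
    out, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            if path[start] != silence:
                out.append((path[start], start, t))
            start = t
    return out

# Assumed toy input: state 0 = silence, states 1 and 2 = two note classes.
# Each row stands in for a network's class posteriors at one time frame.
posteriors = [
    [0.90, 0.05, 0.05], [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05], [0.05, 0.90, 0.05],
    [0.40, 0.20, 0.40],  # ambiguous frame in the middle of a note
    [0.05, 0.90, 0.05],
    [0.90, 0.05, 0.05], [0.90, 0.05, 0.05],
    [0.05, 0.05, 0.90], [0.05, 0.05, 0.90],
]
# Assumed transition model: states tend to persist from frame to frame.
trans = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]

log_emis = [[math.log(p) for p in row] for row in posteriors]
log_trans = [[math.log(p) for p in row] for row in trans]
log_init = [math.log(1 / 3)] * 3

path = viterbi(log_emis, log_trans, log_init)
notes = segments(path)
print(path)   # [0, 0, 1, 1, 1, 1, 0, 0, 2, 2]
print(notes)  # [(1, 2, 6), (2, 8, 10)]
```

In this toy run the ambiguous frame is absorbed into the surrounding note because the transition model penalizes switching states. That is the kind of sequence-level smoothing a per-frame classifier alone would not provide, and the recovered onset and offset frames are the boundary information the summary emphasizes.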
Findings
The hybrid model achieves higher recognition accuracy than the other methods compared.
The new measure evaluates note classification and boundary detection jointly.
The method can be adapted to other species whose vocalizations have similar properties.
Abstract
Research on sequential vocalization often requires analysis of vocalizations in long continuous sounds. In developmental studies or studies across generations, where days or months of vocalizations must be analyzed, methods for automatic recognition are strongly desired. Although automatic speech recognition has been studied intensively for applied purposes, blindly applying those methods to biological questions may not be optimal, because, unlike human speech recognition, the analysis of sequential vocalizations often requires accurate extraction of timing information. In the present study we propose automated systems suited to recognizing birdsong, one of the most intensively investigated sequential vocalizations, focusing on three properties of birdsong. First, a song is a sequence of vocal elements, called notes, which can be…
