Utilizing synthetic training data for the supervised classification of rat ultrasonic vocalizations
K. Jack Scott, Lucinda J. Speers, David K. Bilkey

TL;DR
This study evaluates the use of synthetic USV data to enhance CNN-based classification of rat ultrasonic vocalizations, achieving near-human accuracy and reducing manual labeling effort.
Contribution
It demonstrates that synthetic training data can significantly improve CNN performance in USV classification, making automated analysis more practical.
Findings
VocalMat outperforms DeepSqueak in USV detection and classification.
Synthetic data augmentation improves CNN accuracy.
Automated classification approaches can match human performance.
Abstract
Murine rodents generate ultrasonic vocalizations (USVs) with frequencies that extend to around 120 kHz. These calls are important in social behaviour, so their analysis can provide insights into the function of vocal communication and its dysfunction. The manual identification of USVs, and their subsequent classification into different subcategories, is time-consuming. Although machine learning approaches for identification and classification can lead to enormous efficiency gains, the time and effort required to generate training data can be high, and the accuracy of current approaches can be problematic. Here we compare the detection and classification performance of a trained human against two convolutional neural networks (CNNs), DeepSqueak and VocalMat, on audio containing rat USVs. Furthermore, we test the effect of inserting synthetic USVs into the training data of the VocalMat CNN…
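The abstract does not say how the synthetic USVs were generated, so the following is only an illustrative sketch of the general idea of inserting a synthetic call into a recording, not the authors' procedure. It assumes a call can be approximated by a linear frequency sweep with a smooth amplitude envelope, and that the background is low-level noise. The function names (`synth_usv`, `insert_call`), the 250 kHz sample rate, and the 50 to 70 kHz sweep are hypothetical choices made for the example.

```python
import numpy as np


def synth_usv(duration_s=0.05, f_start_hz=50_000.0, f_end_hz=70_000.0,
              sample_rate_hz=250_000, amplitude=0.5):
    """Return a USV-like chirp: a linear frequency sweep with a Hann envelope.

    All defaults are illustrative assumptions, not values from the paper.
    """
    n = int(round(duration_s * sample_rate_hz))
    t = np.arange(n) / sample_rate_hz
    # Instantaneous frequency rises linearly; the phase is its time integral.
    sweep_rate = (f_end_hz - f_start_hz) / duration_s
    phase = 2.0 * np.pi * (f_start_hz * t + 0.5 * sweep_rate * t ** 2)
    envelope = np.hanning(n)  # fade in and out to avoid clicks at the edges
    return amplitude * envelope * np.sin(phase)


def insert_call(background, call, onset_sample):
    """Add `call` to a copy of `background` starting at `onset_sample`.

    Returns the mixed audio and an (onset, offset) label in samples, which
    is the kind of annotation a supervised detector could be trained on.
    """
    offset_sample = onset_sample + len(call)
    if onset_sample < 0 or offset_sample > len(background):
        raise ValueError("call does not fit inside the background")
    mixed = background.copy()
    mixed[onset_sample:offset_sample] += call
    return mixed, (onset_sample, offset_sample)


# Example: one second of synthetic background noise with a single call in it.
rng = np.random.default_rng(0)
sample_rate_hz = 250_000  # Nyquist limit of 125 kHz covers calls up to ~120 kHz
background = 0.01 * rng.standard_normal(sample_rate_hz)
call = synth_usv(sample_rate_hz=sample_rate_hz)
mixed, label = insert_call(background, call, onset_sample=100_000)
```

Because the onset and offset of every inserted call are known by construction, each synthetic example comes with its label for free, which is the appeal of this kind of data when manual labeling is slow.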
Taxonomy
Topics: Neuroendocrine regulation and behavior · Animal Vocal Communication and Behavior
Methods: Test
