# Multiscale CNN based Deep Metric Learning for Bioacoustic Classification: Overcoming Training Data Scarcity Using Dynamic Triplet Loss

**Authors:** Anshul Thakur, Daksh Thapar, Padmanabhan Rajan, Aditya Nigam

arXiv: 1903.10713 · 2019-09-04

## TL;DR

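This paper introduces a multiscale CNN trained with a dynamic triplet loss for bioacoustic classification when training data is scarce. Four filter sizes at each level capture both fine and broad-context acoustic detail, and the triplet loss, whose margin increases during training, improves class separation in the embedding space.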

## Contribution

The paper presents a multiscale CNN architecture that applies four filter sizes at each level and is trained with a dynamic triplet loss, improving bioacoustic classification performance when little training data is available.

## Key findings

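- Outperforms existing bioacoustic classification frameworks on three publicly available datasets
- Triplet loss is more effective than softmax cross-entropy loss in low training data conditions
- Smaller filters capture the finer details of bioacoustic events, while larger filters capture a larger context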
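
The multiscale idea in the last finding can be illustrated with a minimal sketch. It is not the paper's architecture: it works on a 1-D signal rather than 2-D feature maps, the kernel sizes `(3, 5, 7, 9)` are assumed for illustration, and fixed averaging kernels stand in for learned weights. The helper names `conv1d_same` and `multiscale_block` are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Slide an odd-length kernel over a zero-padded signal so that the
    output has the same length as the input."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def multiscale_block(signal, kernel_sizes=(3, 5, 7, 9)):
    """Apply one filter per kernel size to the same input and stack the
    results as channels. The sizes are assumptions, not the paper's values."""
    return [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]


# A single impulse shows how far each filter "sees" around one time step.
impulse = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
features = multiscale_block(impulse)

# Count the positions each channel responds at: wider kernels cover more context.
support = [sum(1 for v in channel if v != 0) for channel in features]
```

Each channel responds over a window as wide as its kernel, so the small kernels keep the event localized while the large ones summarize its surroundings.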

## Abstract

This paper proposes multiscale convolutional neural network (CNN)-based deep metric learning for bioacoustic classification under low training data conditions. The proposed CNN is characterized by the use of four different filter sizes at each level to analyze input feature maps. This multiscale nature helps in describing different bioacoustic events effectively: smaller filters help in learning the finer details of bioacoustic events, whereas larger filters help in analyzing a larger context, leading to global details. A dynamic triplet loss is employed in the proposed CNN architecture to learn a transformation from the input space to the embedding space, where classification is performed. The triplet loss helps in learning this transformation by analyzing three examples, referred to as triplets, at a time, such that the intra-class distance is minimized while the inter-class separation is maximized by a dynamically increasing margin. The number of possible triplets increases cubically with the dataset size, making triplet loss more suitable than the softmax cross-entropy loss in low training data conditions. Experiments on three different publicly available datasets show that the proposed framework performs better than existing bioacoustic classification frameworks. Experimental results also confirm the superiority of the triplet loss over the cross-entropy loss in low training data conditions.
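
The triplet loss and the cubic growth in triplet count described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the loss uses the common squared-Euclidean hinge form, and the linear margin schedule (`linear_margin`, with `start` and `end` values) is hypothetical, since the abstract only says that the margin increases dynamically.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin):
    """Hinge-style triplet loss: zero once the negative is farther from the
    anchor than the positive by at least `margin`."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


def linear_margin(epoch, num_epochs, start=0.2, end=1.0):
    """Hypothetical schedule: the margin grows linearly from `start` to `end`.
    The paper's actual schedule is not given in the abstract."""
    if num_epochs <= 1:
        return end
    return start + (end - start) * epoch / (num_epochs - 1)


def count_triplets(labels):
    """Number of ordered (anchor, positive, negative) triplets in which anchor
    and positive are distinct examples of one class and the negative comes
    from a different class."""
    sizes = Counter(labels)
    total = len(labels)
    return sum(n * (n - 1) * (total - n) for n in sizes.values())


# The same triplet evaluated with the first and last margin of a 5-epoch run.
anchor, positive, negative = [0.0, 0.0], [1.0, 0.0], [1.0, 0.5]
early = triplet_loss(anchor, positive, negative, linear_margin(0, 5))
late = triplet_loss(anchor, positive, negative, linear_margin(4, 5))

# 30 labelled examples in three balanced classes.
labels = ["a"] * 10 + ["b"] * 10 + ["c"] * 10
num_triplets = count_triplets(labels)
```

In this example the triplet already satisfies the small early margin, so it contributes no loss, but it becomes active again once the margin has grown. The 30 labelled examples yield 5400 valid triplets, which illustrates why a triplet-based objective has many more training signals to draw on than a per-example loss when data is scarce.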

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.10713/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1903.10713/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/1903.10713/full.md

---
Source: https://tomesphere.com/paper/1903.10713