# A Case Study of Deep-Learned Activations via Hand-Crafted Audio Features

**Authors:** Olga Slizovskaia, Emilia Gómez, Gloria Haro

arXiv: 1907.01813 · 2019-07-04

## TL;DR

This paper investigates the explainability of CNNs trained to recognize musical instruments by comparing their learned activations with traditional hand-crafted audio features. It finds that some neural responses correspond to classical audio descriptors.

## Contribution

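The paper proposes a technique for measuring the similarity between CNN activation maps and audio features that take the form of a matrix, such as chromagrams or spectrograms. This gives a way to interpret neural representations in music information retrieval (MIR) in terms of well-understood descriptors.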
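To make the idea of comparing an activation map with a matrix-form feature concrete, here is a minimal sketch. This summary does not state the paper's actual similarity measure, so the Pearson correlation used here is an assumption, as are the nearest-neighbour resampling step and the function names `resize_nearest` and `map_similarity`.

```python
import numpy as np

def resize_nearest(matrix, shape):
    """Resample a 2-D array to `shape` by nearest-neighbour index lookup."""
    rows = np.arange(shape[0]) * matrix.shape[0] // shape[0]
    cols = np.arange(shape[1]) * matrix.shape[1] // shape[1]
    return matrix[np.ix_(rows, cols)]

def map_similarity(activation, feature):
    """Pearson correlation between an activation map and a feature matrix.

    Assumption: the feature (e.g. a chromagram or spectrogram) is first
    resampled to the activation map's shape so both can be flattened and
    compared element by element. Returns 0.0 if either input is constant.
    """
    activation = np.asarray(activation, dtype=float)
    feature = resize_nearest(np.asarray(feature, dtype=float), activation.shape)
    a = activation.ravel() - activation.mean()
    f = feature.ravel() - feature.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(f)
    return float(a @ f / denom) if denom > 0 else 0.0
```

The score lies in [-1, 1]. Resampling is needed because pooling layers usually leave activation maps with a coarser time-frequency resolution than the input features; a real pipeline might prefer interpolation over nearest-neighbour lookup.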

## Key findings

- Shallow-layer activations resemble the harmonic and percussive components of the spectrum.
- In deeper layers, high-level activation maps are compared with chromagrams, and deep-learned embeddings with loudness and onset rate.
- Some neurons explicitly correspond to classical audio features.
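One way to look for units that track a scalar descriptor such as loudness or onset rate is to correlate each embedding dimension with that descriptor across clips. The sketch below is illustrative only: the data are synthetic, the function name `neuron_feature_correlation` is hypothetical, and Pearson correlation is an assumed stand-in, since this summary does not describe the paper's exact procedure.

```python
import numpy as np

def neuron_feature_correlation(embeddings, feature):
    """Pearson r between each embedding unit and a per-clip scalar feature.

    embeddings: array of shape (n_clips, n_units)
    feature:    array of shape (n_clips,)
    Returns an array of shape (n_units,); constant units get r = 0.
    """
    e = embeddings - embeddings.mean(axis=0)
    f = feature - feature.mean()
    denom = np.linalg.norm(e, axis=0) * np.linalg.norm(f)
    safe = np.where(denom > 0, denom, 1.0)  # avoid division by zero
    return np.where(denom > 0, (e.T @ f) / safe, 0.0)

# Synthetic stand-ins (assumptions, not data from the paper).
rng = np.random.default_rng(0)
loudness = rng.normal(size=200)            # one descriptor value per clip
embeddings = rng.normal(size=(200, 16))    # one 16-unit embedding per clip
# Plant a single unit that follows the descriptor, plus a little noise.
embeddings[:, 5] = 2.0 * loudness + 0.1 * rng.normal(size=200)

r = neuron_feature_correlation(embeddings, loudness)
best_unit = int(np.argmax(np.abs(r)))      # unit most aligned with loudness
```

Ranking units by absolute correlation surfaces candidates for closer inspection; a high value shows a linear association across the sampled clips, not that the unit computes the descriptor.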

## Abstract

The explainability of Convolutional Neural Networks (CNNs) is a particularly challenging task in all areas of application, and it is notably under-researched in the music and audio domain. In this paper, we approach explainability by exploiting the knowledge we have on hand-crafted audio features. Our study focuses on a well-defined MIR task, the recognition of musical instruments from user-generated music recordings. We compute the similarity between a set of traditional audio features and representations learned by CNNs. We also propose a technique for measuring the similarity between activation maps and audio features which are typically presented in the form of a matrix, such as chromagrams or spectrograms. We observe that some neurons' activations correspond to well-known classical audio features. In particular, for shallow layers, we found similarities between activations and harmonic and percussive components of the spectrum. For deeper layers, we compare chromagrams with high-level activation maps as well as loudness and onset rate with deep-learned embeddings.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.01813/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1907.01813/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1907.01813/full.md

---
Source: https://tomesphere.com/paper/1907.01813