Representation Learning for the Automatic Indexing of Sound Effects Libraries

Alison B. Ma; Alexander Lerch

arXiv:2208.09096 · cs.SD · August 22, 2022

TL;DR

This paper proposes a dataset-independent, taxonomy-agnostic representation learning method for sound effects libraries. The learned embeddings improve search and categorization despite inconsistent metadata and limited data, and they outperform established representations such as OpenL3.

Contribution

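The paper introduces a generalized embedding approach for sound effects that addresses dataset limitations (class imbalance, inconsistent class labels, and insufficient dataset size) and does not depend on any particular taxonomy, which eases sound library management.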

Findings

01 Dataset-independent embeddings outperform OpenL3
02 Metric learning improves representation quality
03 Cross-dataset training enhances generalization

Abstract

Labeling and maintaining a commercial sound effects library is a time-consuming task exacerbated by databases that continually grow in size and undergo taxonomy updates. Moreover, sound search and taxonomy creation are complicated by non-uniform metadata, an unrelenting problem even with the introduction of a new industry standard, the Universal Category System. To address these problems and overcome dataset-dependent limitations that inhibit the successful training of deep learning models, we pursue representation learning to train generalized embeddings that can be used for a wide variety of sound effects libraries and are a taxonomy-agnostic representation of sound. We show that a task-specific but dataset-independent representation can successfully address data issues such as class imbalance, inconsistent class labels, and insufficient dataset size, outperforming established…

Code & Models

Repositories

alisonbma/aisfx
PyTorch · Official

Taxonomy

Topics: Music and Audio Processing · Diverse Musicological Studies · Music Technology and Sound Studies

MethodsLib