Towards End-to-End Audio-Sheet-Music Retrieval

Matthias Dorfer; Andreas Arzt; Gerhard Widmer

arXiv:1612.05070·cs.SD·December 16, 2016·5 cites

TL;DR

This paper explores a novel end-to-end method for cross-modal retrieval between audio snippets and sheet music images using deep learning, enabling music content search without symbolic representations.

Contribution

It introduces a DCCA-based approach for learning correlated latent spaces for audio and sheet music retrieval without relying on symbolic music data.

Findings

01

Initial experiments show promising retrieval accuracy.

02

The method has so far been tested only on relatively simple monophonic music.

03

Cross-modality retrieval is feasible without symbolic scores.

Abstract

This paper demonstrates the feasibility of learning to retrieve short snippets of sheet music (images) given a short query excerpt of music (audio), and vice versa, without any symbolic representation of music or scores. Such a capability would be highly useful in many content-based musical retrieval scenarios. Our approach is based on Deep Canonical Correlation Analysis (DCCA) and learns correlated latent spaces that allow cross-modality retrieval in both directions. Initial experiments with relatively simple monophonic music show promising results.
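To make the idea of a correlated latent space concrete, here is a minimal sketch of retrieval in both directions. It is not the paper's implementation: it swaps the deep networks of DCCA for plain linear CCA, and it replaces real audio and sheet-music snippets with synthetic feature vectors generated from a shared latent factor. The function names, feature dimensions, noise level, and the use of cosine similarity for ranking are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_cca(X, Y, k, reg=1e-3):
    """Linear CCA (a stand-in for DCCA): returns view means and projections."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariance and cross-covariance estimates.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance T = Lx^{-1} Sxy Ly^{-T}.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    U, _, Vt = np.linalg.svd(T, full_matrices=False)
    # Map the top-k singular vectors back to the original feature spaces.
    A = np.linalg.solve(Lx.T, U[:, :k])
    B = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return mx, my, A, B

def project(Z, mean, W):
    """Embed one view into the shared latent space."""
    return (Z - mean) @ W

def retrieve(queries, candidates):
    """For each query, return the index of the most cosine-similar candidate."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return np.argmax(q @ c.T, axis=1)

# Synthetic paired data (hypothetical): both views depend on one latent factor.
rng = np.random.default_rng(0)
n_train, n_test, d_latent = 500, 50, 8
Wx = rng.normal(size=(d_latent, 20))  # latent -> "audio" features
Wy = rng.normal(size=(d_latent, 30))  # latent -> "sheet image" features

def make_pairs(n):
    z = rng.normal(size=(n, d_latent))
    X = z @ Wx + 0.05 * rng.normal(size=(n, 20))
    Y = z @ Wy + 0.05 * rng.normal(size=(n, 30))
    return X, Y

X_tr, Y_tr = make_pairs(n_train)
X_te, Y_te = make_pairs(n_test)

mx, my, A, B = fit_linear_cca(X_tr, Y_tr, k=d_latent)
audio_emb = project(X_te, mx, A)
sheet_emb = project(Y_te, my, B)

# Pair i in one view should retrieve pair i in the other view.
truth = np.arange(n_test)
acc_a2s = float(np.mean(retrieve(audio_emb, sheet_emb) == truth))
acc_s2a = float(np.mean(retrieve(sheet_emb, audio_emb) == truth))
print("audio -> sheet accuracy:", acc_a2s)
print("sheet -> audio accuracy:", acc_s2a)
```

Because both views are projected into the same space, a single set of embeddings serves both retrieval directions; only the roles of query and candidate are swapped. In the DCCA setting described in the abstract, the two linear projections would be replaced by neural networks trained on the audio and image inputs.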

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing