# Cross-Modal Music Retrieval and Applications: An Overview of Key Methodologies

**Authors:** Meinard Müller, Andreas Arzt, Stefan Balke, Matthias Dorfer, and Gerhard Widmer

arXiv: 1902.04397 · 2019-02-13

## TL;DR

This paper surveys cross-modal music retrieval methods, which link music data of different types (such as audio recordings, sheet music images, and video clips) so that users can explore large music collections more easily.

## Contribution

It summarizes key methodologies in cross-modal music retrieval, highlighting recent advances and applications beyond traditional audio identification.

## Key findings

- Survey of existing cross-modal retrieval techniques
- Identification of challenges in multimodal music data integration
- Discussion of practical applications and future directions

## Abstract

There has been a rapid growth of digitally available music data, including audio recordings, digitized images of sheet music, album covers and liner notes, and video clips. This huge amount of data calls for retrieval strategies that allow users to explore large music collections in a convenient way. More precisely, there is a need for cross-modal retrieval algorithms that, given a query in one modality (e.g., a short audio excerpt), find corresponding information and entities in other modalities (e.g., the name of the piece and the sheet music). This goes beyond exact audio identification and subsequent retrieval of metainformation as performed by commercial applications like Shazam [1].
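To make the query-in-one-modality, result-in-another idea concrete, here is a minimal sketch of one common way such a lookup can be framed: items from both modalities are represented as vectors in a shared space, and the query is matched to its nearest neighbour. This is an illustration only, not the method described in the paper. The vectors, the catalog entries, the `retrieve` helper, and the choice of cosine similarity are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, catalog, top_k=1):
    """Return the names of the top_k catalog items most similar to the query.

    `catalog` maps an item name to its vector. Both the query and the
    catalog vectors are assumed to live in the same (hypothetical) shared space.
    """
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical toy data: three sheet-music items and one audio query.
# Real systems would compute these vectors from the data; here they are made up.
sheet_music_catalog = {
    "piece_a_score": [0.9, 0.1, 0.0],
    "piece_b_score": [0.1, 0.8, 0.3],
    "piece_c_score": [0.0, 0.2, 0.9],
}
audio_query = [0.2, 0.9, 0.2]

best_match = retrieve(audio_query, sheet_music_catalog)
print(best_match)
```

With this toy data the audio query is closest to `piece_b_score`. The point of the sketch is only the shape of the task: the query and the returned item come from different modalities, and the match is by similarity rather than by exact identification.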

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.04397/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1902.04397/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1902.04397/full.md

---
Source: https://tomesphere.com/paper/1902.04397