Audio-based Distributional Semantic Models for Music Auto-tagging and Similarity Measurement

Giannis Karamanolakis; Elias Iosif; Athanasia Zlatintsi; Aggelos Pikrakis; Alexandros Potamianos

arXiv:1612.08391·cs.IR·December 28, 2016

Open Access

TL;DR

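This paper applies Audio-based Distributional Semantic Models (ADSMs), which represent audio and lexical items in a joint acoustic-semantic space, to automatic music tagging. The predicted tags and their acoustic representations are then used to build clip embeddings that outperform the state of the art on music similarity measurement.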

Contribution

It applies joint acoustic-semantic representations to automatic tag generation and combines the predicted tags with their acoustic representations to construct acoustic-semantic clip embeddings for music similarity measurement.

Findings

01. Outperforms the state of the art in music similarity measurement

02. Produces high-quality tags for audio/music clips

03. Builds acoustic-semantic clip embeddings from predicted tags and their acoustic representations

Abstract

The recent development of Audio-based Distributional Semantic Models (ADSMs) enables the computation of audio and lexical vector representations in a joint acoustic-semantic space. In this work, these joint representations are applied to the problem of automatic tag generation. The predicted tags, together with their corresponding acoustic representations, are exploited for the construction of acoustic-semantic clip embeddings. The proposed algorithms are evaluated on the task of similarity measurement between music clips. Acoustic-semantic models are shown to outperform the state of the art for this task and to produce high-quality tags for audio/music clips.
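The pipeline described in the abstract (tag prediction in a joint space, clip embeddings built from the predicted tags, similarity between clips) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 3-dimensional vectors, the use of cosine similarity, the top-k tag selection, and the choice to form a clip embedding by concatenating the clip's acoustic vector with the mean of its predicted tag vectors. The function names `predict_tags` and `clip_embedding` are hypothetical.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def predict_tags(clip_vec, tag_vecs, k=2):
    """Rank tags by cosine similarity to the clip and return the top k.

    Assumption: clips and tags already live in the same (joint) vector space.
    """
    scores = {tag: cosine(clip_vec, vec) for tag, vec in tag_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def clip_embedding(clip_vec, tag_vecs, k=2):
    """Illustrative clip embedding: acoustic vector + mean of predicted tag vectors.

    Assumption: concatenation is just one plausible way to combine the two parts.
    """
    tags = predict_tags(clip_vec, tag_vecs, k)
    tag_mean = np.mean([tag_vecs[t] for t in tags], axis=0)
    return np.concatenate([clip_vec, tag_mean])


# Toy joint space (made-up numbers, for illustration only).
tag_vecs = {
    "rock": np.array([1.0, 0.0, 0.0]),
    "guitar": np.array([0.8, 0.6, 0.0]),
    "piano": np.array([0.0, 0.0, 1.0]),
}
clip_a = np.array([0.9, 0.2, 0.1])  # rock-like clip
clip_b = np.array([0.7, 0.5, 0.0])  # guitar-like clip
clip_c = np.array([0.1, 0.0, 0.9])  # piano-like clip

emb_a = clip_embedding(clip_a, tag_vecs)
emb_b = clip_embedding(clip_b, tag_vecs)
emb_c = clip_embedding(clip_c, tag_vecs)

sim_ab = cosine(emb_a, emb_b)
sim_ac = cosine(emb_a, emb_c)
print(predict_tags(clip_a, tag_vecs), sim_ab, sim_ac)
```

In this toy setup the two guitar-oriented clips end up closer to each other than to the piano clip, which is the kind of behavior a clip similarity measure is expected to show. How the paper actually predicts tags and combines them with acoustic representations is not specified in the abstract.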

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech Recognition and Synthesis