LARP: Language Audio Relational Pre-training for Cold-Start Playlist Continuation

Rebecca Salganik, Xiaohao Liu, Yunshan Ma, Jian Kang, and Tat-Seng Chua

arXiv:2406.14333 · cs.IR · June 21, 2024

TL;DR

LARP is a multi-modal contrastive learning framework for cold-start playlist continuation. It integrates language, audio, and relational signals into content representations and outperforms uni-modal and multi-modal baselines on public datasets.

Contribution

The paper introduces LARP, a novel three-stage contrastive learning model that effectively incorporates multi-modal and relational signals for cold-start playlist continuation.
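To make the idea of contrastive alignment between modalities concrete, here is a minimal sketch of a generic symmetric InfoNCE loss between paired audio and text embeddings. It is not the paper's objective: this page does not specify LARP's three stages or loss functions, so the function names, the temperature value of 0.07, and the toy embeddings below are all assumptions made for illustration. Pure Python is used so the sketch runs without dependencies.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(audio_embs, text_embs, temperature=0.07):
    """Generic symmetric InfoNCE loss (hypothetical helper, not LARP's loss).

    audio_embs[i] and text_embs[i] are assumed to describe the same track;
    every other pairing in the batch is treated as a negative.
    """
    audio = [l2_normalize(v) for v in audio_embs]
    text = [l2_normalize(v) for v in text_embs]
    # Cosine similarity of every audio item against every text item.
    logits = [[dot(a, t) / temperature for t in text] for a in audio]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the audio-to-text and text-to-audio directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy batch of two tracks with 2-dimensional embeddings (made-up values).
aligned = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is low when each track's audio embedding is closest to its own text embedding and high when the pairs are mismatched, which is the general behavior a contrastive pre-training stage relies on.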

Findings

1. LARP outperforms uni-modal and multi-modal baselines on public datasets.

2. The three-stage contrastive framework enhances content representations for cold-start scenarios.

3. Code and datasets are publicly available for reproducibility.

Abstract

As online music consumption increasingly shifts towards playlist-based listening, the task of playlist continuation, in which an algorithm suggests songs to extend a playlist in a personalized and musically cohesive manner, has become vital to the success of music streaming. Currently, many existing playlist continuation approaches rely on collaborative filtering methods to perform recommendation. However, such methods will struggle to recommend songs that lack interaction data, an issue known as the cold-start problem. Current approaches to this challenge design complex mechanisms for extracting relational signals from sparse collaborative data and integrating them into content representations. However, these approaches leave content representation learning out of scope and utilize frozen, pre-trained content models that may not be aligned with the distribution or format of a specific…

Code & Models

Repositories

rsalganik1123/larp
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Phonetics and Phonology Research

Methods: Contrastive Learning