TL;DR
APEX is a large-scale multi-task learning framework that predicts popularity and aesthetic quality of AI-generated music, demonstrating strong generalization across unseen music systems.
Contribution
APEX is the first large-scale multi-task framework for AI-generated music, trained on over 211k songs to jointly predict popularity and aesthetic quality from frozen audio embeddings.
Findings
Aesthetic features improve preference prediction across unseen music systems.
The model generalizes well to out-of-distribution generative architectures.
Jointly predicting popularity and aesthetic quality captures complementary aspects of the music.
Abstract
Music popularity prediction has attracted growing research interest, with relevance to artists, platforms, and recommendation systems. However, the explosive rise of AI-generated music platforms has created an entirely new and largely unexplored landscape, where a surge of songs is produced and consumed daily without the traditional markers of artist reputation or label backing. A key yet unexplored factor in this pursuit is aesthetic quality. We propose APEX, the first large-scale multi-task learning framework for AI-generated music, trained on over 211k songs (10k hours of audio) from Suno and Udio. It jointly predicts engagement-based popularity signals (streams and likes scores) alongside five perceptual aesthetic quality dimensions from frozen audio embeddings extracted from MERT, a self-supervised music understanding model. Aesthetic quality and popularity capture complementary…
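As a rough illustration of the setup the abstract describes, the sketch below maps one precomputed, frozen embedding through a shared layer into two output groups: two popularity scores (streams, likes) and five aesthetic scores. Everything beyond those output counts is an assumption made for illustration and is not taken from the paper: the toy layer sizes, the single shared hidden layer, the unweighted sum of mean-squared errors, and all function names. A random vector stands in for a real MERT embedding.

```python
import random

EMBED_DIM = 8      # toy width (assumption); real MERT embeddings are much wider
HIDDEN_DIM = 4     # hypothetical shared-layer width
N_POPULARITY = 2   # streams score, likes score
N_AESTHETIC = 5    # five perceptual aesthetic dimensions


def init_layer(rng, n_in, n_out):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, weights, bias):
    """Apply a dense layer to a single input vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def forward(params, embedding):
    """Shared ReLU layer followed by one linear head per task group.

    The embedding is treated as a fixed input: the encoder that produced it
    is frozen and is not part of this model.
    """
    hidden = [max(0.0, h) for h in linear(embedding, *params["shared"])]
    popularity = linear(hidden, *params["popularity"])
    aesthetics = linear(hidden, *params["aesthetic"])
    return popularity, aesthetics


def mse(pred, target):
    """Mean-squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(popularity, aesthetics, pop_target, aes_target, aes_weight=1.0):
    """Sum of per-group MSE losses; the weighting scheme is an assumption."""
    return mse(popularity, pop_target) + aes_weight * mse(aesthetics, aes_target)


rng = random.Random(0)
params = {
    "shared": init_layer(rng, EMBED_DIM, HIDDEN_DIM),
    "popularity": init_layer(rng, HIDDEN_DIM, N_POPULARITY),
    "aesthetic": init_layer(rng, HIDDEN_DIM, N_AESTHETIC),
}

# Stand-in for one song's precomputed embedding and its (made-up) targets.
embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
popularity, aesthetics = forward(params, embedding)
loss = multitask_loss(popularity, aesthetics, [0.0] * N_POPULARITY, [0.0] * N_AESTHETIC)
```

Because the encoder is frozen, embeddings can in principle be extracted once and cached, so only the small shared layer and task heads need to be trained.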
