S2Cap: A Benchmark and a Baseline for Singing Style Captioning

Hyunjong Ok; Jaeho Lee

arXiv:2409.09866 · cs.CL · August 19, 2025

TL;DR

This paper introduces S2Cap, a comprehensive dataset for singing style captioning, and proposes a simple baseline algorithm, addressing the lack of detailed singing voice datasets for downstream tasks.

Contribution

The paper presents S2Cap, a new dataset with detailed singing voice descriptions, and a baseline algorithm for singing style captioning, filling a key gap in the field.

Findings

01. The S2Cap dataset covers diverse vocal, acoustic, and demographic attributes.

02. A simple baseline algorithm establishes initial performance on singing style captioning.

03. The dataset is publicly available, which facilitates future research in singing voice analysis.

Abstract

Singing voices contain much richer information than common voices, including varied vocal and acoustic properties. However, current open-source audio-text datasets for singing voices capture only a narrow range of attributes and lack acoustic features, leading to limited utility towards downstream tasks, such as style captioning. To fill this gap, we formally define the singing style captioning task and present S2Cap, a dataset of singing voices with detailed descriptions covering diverse vocal, acoustic, and demographic characteristics. Using this dataset, we develop an efficient and straightforward baseline algorithm for singing style captioning. The dataset is available at https://zenodo.org/records/15673764.

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Diverse Musicological Studies

Methods: Focus · Sparse Evolutionary Training