ChoralSynth: Synthetic Dataset of Choral Singing
Jyoti Narang, Viviana De La Vega, Xavier Lizarraga, Oscar Mayor, Hector Parra, Jordi Janer, Xavier Serra

TL;DR
ChoralSynth introduces a synthetic choral singing dataset created using synthesizers and public domain scores, aiming to facilitate Music Information Retrieval research by overcoming data scarcity.
Contribution
This work presents a novel methodology for generating high-quality synthetic choral datasets using synthesizers and public domain scores, filling a critical gap in MIR research.
Findings
Dataset includes diverse choral renditions with metadata
Methodology enables scalable and high-quality data creation
Supports new research in singing voice analysis
Abstract
Choral singing, a widely practiced form of ensemble singing, lacks comprehensive datasets in Music Information Retrieval (MIR) research because of the difficulty of curating multitrack recordings. To address this, we devised a novel methodology that leverages state-of-the-art synthesizers to create and curate quality renditions. The scores were sourced from the Choral Public Domain Library (CPDL). This work was done in collaboration with a diverse team of musicians, software engineers, and researchers. The resulting dataset, complete with its associated metadata, and the methodology are released as part of this work, opening up new avenues for exploration and advancement in singing voice research.
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Diverse Musicological Studies
