The Rest is Silence: Leveraging Unseen Species Models for Computational Musicology
Fabian C. Moss, Jan Hajič jr., Adrian Nachtwey, and Laurent Pugin

TL;DR
This paper introduces Unseen Species Models from ecology to address incomplete musicological datasets, enabling quantitative estimates of missing data and coverage in various musicological contexts.
Contribution
It applies Unseen Species Models to musicology for the first time, providing a novel quantitative approach to estimate unseen or missing musical data.
Findings
Estimated the number of missing composers in RISM
Assessed the coverage of medieval Gregorian chant sources
Predicted differences in music print editions
Abstract
For many decades, musicologists have engaged in creating large databases serving different purposes for musicological research and scholarship. With the rise of fields like music information retrieval and digital musicology, there is now a constant and growing influx of musicologically relevant datasets and corpora. In historical or observational settings, however, these datasets are necessarily incomplete, and the true extent of a collection of interest remains unknown -- silent. Here, we apply, for the first time, so-called Unseen Species models (USMs) from ecology to areas of musicological activity. After introducing the models formally, we show in four case studies how USMs can be applied to musicological data to address quantitative questions like: How many composers are we missing in RISM? What percentage of medieval sources of Gregorian chant have we already cataloged? How many…
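To make the kind of estimate described above concrete, here is a minimal sketch of two standard estimators from the ecology literature: Chao1, a lower bound on total richness computed from the numbers of "species" seen exactly once (f1) and exactly twice (f2), and the Good-Turing sample coverage 1 - f1/n. This summary does not say which estimators the paper actually uses, so treat the choice as an assumption. The function names and the toy catalog are hypothetical and are not drawn from RISM or any real dataset.

```python
from collections import Counter


def chao1(abundances):
    """Chao1 lower-bound estimate of the total number of species.

    abundances: iterable of per-species observation counts.
    """
    counts = [a for a in abundances if a > 0]
    s_obs = len(counts)                      # species actually observed
    f1 = sum(1 for a in counts if a == 1)    # singletons
    f2 = sum(1 for a in counts if a == 2)    # doubletons
    if f2 > 0:
        unseen = f1 * f1 / (2 * f2)
    else:
        # Bias-corrected form, used when there are no doubletons.
        unseen = f1 * (f1 - 1) / 2
    return s_obs + unseen


def good_turing_coverage(abundances):
    """Good-Turing estimate of sample coverage: 1 - f1 / n."""
    counts = [a for a in abundances if a > 0]
    n = sum(counts)                          # total number of observations
    f1 = sum(1 for a in counts if a == 1)
    return 1 - f1 / n if n else 0.0


# Hypothetical toy catalog: each entry is one source attributed to a composer.
catalog = ["A", "A", "A", "B", "B", "C", "C", "D", "E", "F"]
abundances = list(Counter(catalog).values())

estimated_total = chao1(abundances)          # 6 observed + 3*3 / (2*2) = 8.25
coverage = good_turing_coverage(abundances)  # 1 - 3/10 = 0.7
print(estimated_total, coverage)
```

In this toy case, six composers are observed, the estimate suggests at least about two more remain unseen, and roughly 70% of the underlying attributions are covered by composers already in the catalog.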
