Subjective Assessments of Legibility in Ancient Manuscript Images -- The SALAMI Dataset
Simon Brenner, Robert Sablatnig

TL;DR
This paper introduces SALAMI, a dataset of expert-annotated legibility scores for ancient manuscript images, intended as ground truth for developing quantitative evaluation metrics for digital text restoration.
Contribution
It presents the first dataset of subjective legibility assessments with validated reliability, enabling quantitative evaluation of digital text restoration techniques.
Findings
High intra- and inter-rater agreement in legibility scores
Most of the variation in scores is attributable to the image regions rather than to differences between observers
The dataset supports development of quantitative metrics for digital heritage restoration
Abstract
The research field concerned with the digital restoration of degraded written heritage lacks a quantitative metric for evaluating its results, which prevents the comparison of relevant methods on large datasets. Thus, we introduce a novel dataset of Subjective Assessments of Legibility in Ancient Manuscript Images (SALAMI) to serve as a ground truth for the development of quantitative evaluation metrics in the field of digital text restoration. This dataset consists of 250 images of 50 manuscript regions with corresponding spatial maps of mean legibility and uncertainty, which are based on a study conducted with 20 experts of philology and paleography. As this study is the first of its kind, the validity and reliability of its design and the results obtained are motivated statistically: we report a high intra- and inter-rater agreement and show that the bulk of variation in the scores…
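To make the idea of "spatial maps of mean legibility and uncertainty" concrete, the following is a minimal sketch of how per-rater score maps could be aggregated per cell. It is not the authors' procedure: the function name `aggregate_legibility`, the 0 to 100 score scale, the tiny 2x2 grid, and the use of the population standard deviation as the uncertainty measure are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def aggregate_legibility(rater_maps):
    """Combine per-rater score maps into a mean map and a spread map.

    rater_maps: list of 2D lists (rows x cols), one map per rater,
    all with the same shape. Hypothetical helper, not from the paper.
    """
    rows = len(rater_maps[0])
    cols = len(rater_maps[0][0])
    mean_map = [[0.0] * cols for _ in range(rows)]
    spread_map = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect every rater's score for this cell.
            scores = [m[r][c] for m in rater_maps]
            mean_map[r][c] = mean(scores)
            # Assumption: uncertainty modelled as population std. dev.
            spread_map[r][c] = pstdev(scores)
    return mean_map, spread_map


# Toy data: three raters scoring a 2x2 grid on an assumed 0-100 scale.
rater_maps = [
    [[80, 20], [50, 0]],
    [[100, 40], [50, 0]],
    [[90, 30], [50, 0]],
]
mean_map, spread_map = aggregate_legibility(rater_maps)
```

Cells where all raters agree (the bottom row of the toy data) get a spread of zero, while cells with diverging scores get a larger value, which is the kind of information an uncertainty map would carry alongside the mean.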
