Subjective Assessments of Legibility in Ancient Manuscript Images -- The SALAMI Dataset

Simon Brenner; Robert Sablatnig

arXiv:2102.09961·cs.CV·February 22, 2021

TL;DR

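This paper introduces SALAMI, a dataset of 250 images of 50 manuscript regions with spatial maps of mean legibility and uncertainty, derived from a study with 20 experts in philology and paleography. It is intended as a ground truth for developing quantitative evaluation metrics for digital text restoration.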

Contribution

It presents the first dataset of subjective legibility assessments with validated reliability, enabling quantitative evaluation of digital text restoration techniques.

Findings

01

High intra- and inter-rater agreement in legibility scores

02

Scores are primarily influenced by image regions, not observer variability

03

The dataset supports development of quantitative metrics for digital heritage restoration

Abstract

The research field concerned with the digital restoration of degraded written heritage lacks a quantitative metric for evaluating its results, which prevents the comparison of relevant methods on large datasets. Thus, we introduce a novel dataset of Subjective Assessments of Legibility in Ancient Manuscript Images (SALAMI) to serve as a ground truth for the development of quantitative evaluation metrics in the field of digital text restoration. This dataset consists of 250 images of 50 manuscript regions with corresponding spatial maps of mean legibility and uncertainty, which are based on a study conducted with 20 experts of philology and paleography. As this study is the first of its kind, the validity and reliability of its design and the results obtained are motivated statistically: we report a high intra- and inter-rater agreement and show that the bulk of variation in the scores…
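
As a rough illustration of what "spatial maps of mean legibility and uncertainty" could look like in practice, the sketch below aggregates per-rater score grids into a mean map and an uncertainty map. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not specify how uncertainty is computed, so the sample standard deviation across raters is used here as a stand-in, and the function name `legibility_maps`, the grid layout, and the example scores are all hypothetical.

```python
from statistics import mean, stdev

def legibility_maps(rater_maps):
    """Aggregate per-rater score grids into mean and uncertainty grids.

    rater_maps: list of 2D lists (one grid per rater), all the same shape.
    Uncertainty is ASSUMED to be the sample standard deviation across
    raters; the abstract does not state the actual definition.
    """
    rows = len(rater_maps[0])
    cols = len(rater_maps[0][0])
    mean_map = []
    uncertainty_map = []
    for r in range(rows):
        mean_row = []
        unc_row = []
        for c in range(cols):
            # Collect the scores all raters gave to this grid cell.
            cell_scores = [m[r][c] for m in rater_maps]
            mean_row.append(mean(cell_scores))
            unc_row.append(stdev(cell_scores))  # needs at least 2 raters
        mean_map.append(mean_row)
        uncertainty_map.append(unc_row)
    return mean_map, uncertainty_map

# Hypothetical scores from three raters on a 2x2 grid of one image region.
example_raters = [
    [[4, 2], [0, 1]],
    [[4, 3], [1, 1]],
    [[4, 1], [2, 1]],
]
mean_map, uncertainty_map = legibility_maps(example_raters)
```

Cells where all raters agree get an uncertainty of zero, while cells with diverging scores get a larger value, which is the kind of per-region signal an evaluation metric could be weighted by.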
