Nuremberg Letterbooks: A Multi-Transcriptional Dataset of Early 15th Century Manuscripts for Document Analysis
Martin Mayr, Julian Krenz, Katharina Neumeier, Anna Bub, Simon Bürcky, Nina Brolich, Klaus Herbers, Mechthild Habermann, Peter Fleischmann, Andreas Maier, Vincent Christlein

TL;DR
The Nuremberg Letterbooks dataset offers a multi-transcriptional collection of early 15th-century manuscripts, enabling more humanities-aligned document analysis methods with multiple transcription types and metadata.
Contribution
It introduces a novel dataset with diverse transcriptions and metadata for early 15th-century manuscripts, supporting humanities-focused document analysis.
Findings
Established baseline results for handwriting recognition tasks.
Demonstrated data consistency across transcriptions.
Provided benchmarks for future research.
Abstract
Most datasets in the field of document analysis utilize highly standardized labels, which, while simplifying specific tasks, often produce outputs that are not directly applicable to humanities research. In contrast, the Nuremberg Letterbooks dataset, which comprises historical documents from the early 15th century, addresses this gap by providing multiple types of transcriptions and accompanying metadata. This approach allows for developing methods that are more closely aligned with the needs of the humanities. The dataset includes 4 books containing 1711 labeled pages written by 10 scribes. Three types of transcriptions are provided for handwritten text recognition: basic, diplomatic, and regularized. For the latter two, versions with and without expanded abbreviations are also available. A combination of letter ID and writer ID supports writer identification due to changing writers…
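The transcription types named in the abstract can be enumerated as a small sketch. It assumes only what the abstract states: a basic transcription, plus diplomatic and regularized transcriptions that each come with and without expanded abbreviations. The class and field names (`TranscriptionVariant`, `abbreviations_expanded`) are hypothetical and are not taken from the dataset's actual file layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TranscriptionVariant:
    """Hypothetical label for one transcription flavour of a text line."""
    kind: str                                # "basic", "diplomatic", or "regularized"
    abbreviations_expanded: Optional[bool]   # None where the abstract names no such split


def transcription_variants() -> List[TranscriptionVariant]:
    """List the variants implied by the abstract (an assumption, not the official schema)."""
    variants = [TranscriptionVariant("basic", None)]
    # Only the diplomatic and regularized types have versions
    # with and without expanded abbreviations.
    for kind in ("diplomatic", "regularized"):
        for expanded in (False, True):
            variants.append(TranscriptionVariant(kind, expanded))
    return variants


if __name__ == "__main__":
    for variant in transcription_variants():
        print(variant)
```

Under this reading, each labeled line could carry up to five target strings, one per variant, which is what makes the dataset multi-transcriptional.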
Taxonomy
Topics: Historical Geopolitical and Social Dynamics · Historical and Archaeological Studies · Digital Humanities and Scholarship
