HORAE: an annotated dataset of books of hours
Mélodie Boillet, Marie-Laurence Bonhomme, Dominique Stutzmann, and Christopher Kermorvant

TL;DR
This paper introduces a new annotated dataset of medieval books of hours, enabling historical research on religious and cultural evolution through detailed text and image annotations.
Contribution
The paper presents a newly created, manually annotated dataset of pages from books of hours and evaluates a state-of-the-art system on it for text line detection and for zone detection and typing.
Findings
The dataset supports historical research on the evolution of the religious mindset in late medieval Europe.
A state-of-the-art system is evaluated on text line detection and on zone detection and typing.
The dataset is publicly available for further research.
Abstract
We introduce in this paper a new dataset of annotated pages from books of hours, a type of handwritten prayer book owned and used by rich lay people in the late Middle Ages. The dataset was created for conducting historical research on the evolution of the religious mindset in Europe in this period, since books of hours represent one of the major sources of information thanks both to their rich illustrations and to the different types of religious sources they contain. We first describe how the corpus was collected and manually annotated, then present the evaluation of a state-of-the-art system for text line detection and for zone detection and typing. The corpus is freely available for research.
