EduceLab-Scrolls: Verifiable Recovery of Text from Herculaneum Papyri using X-ray CT
Stephen Parsons, C. Seth Parker, Christy Chapman, Mami Hayashida, W. Brent Seales

TL;DR
This paper introduces a comprehensive software pipeline and dataset for recovering hidden texts from Herculaneum papyri using X-ray CT, combining machine learning with geometric analysis to reveal previously inaccessible ancient writings.
Contribution
It presents the first aligned dataset linking spectral photography and X-ray CT images, enabling supervised detection of invisible ink and revealing new texts from ancient scrolls.
Findings
Successfully revealed hidden texts on ancient scrolls.
Created the largest heritage dataset with aligned spectral and CT images.
Enabled discovery of previously unknown texts from Herculaneum papyri.
Abstract
We present a complete software pipeline for revealing the hidden texts of the Herculaneum papyri using X-ray CT images. This enhanced virtual unwrapping pipeline combines machine learning with a novel geometric framework linking 3D and 2D images. We also present EduceLab-Scrolls, a comprehensive open dataset representing two decades of research effort on this problem. EduceLab-Scrolls contains a set of volumetric X-ray CT images of both small fragments and intact, rolled scrolls. The dataset also contains 2D image labels that are used in the supervised training of an ink detection model. Labeling is enabled by aligning spectral photography of scroll fragments with X-ray CT images of the same fragments, thus creating a machine-learnable mapping between image spaces and modalities. This alignment permits supervised learning for the detection of "invisible" carbon ink in X-ray CT, a task…
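The labeling idea in the abstract, where aligned photographs supply per-pixel ink labels for CT data, can be illustrated with a minimal sketch. It assumes the CT data has already been resampled into a small stack of layers around the flattened papyrus surface and that the photograph-derived label is a binary mask on the same pixel grid. The function name `make_training_pairs`, the `patch` parameter, and the nested-list data layout are hypothetical choices made for illustration; the paper's actual sampling scheme and model are not described in this summary.

```python
def make_training_pairs(surface_volume, ink_label, patch=3):
    """Pair each labeled pixel with the CT subvolume centered on it.

    surface_volume: nested list indexed [z][y][x], CT intensities sampled
        around the flattened surface (assumed already aligned to the label).
    ink_label: nested list indexed [y][x], 1 where the aligned photograph
        shows ink and 0 elsewhere.
    patch: odd window width in pixels (hypothetical parameter).
    Returns a list of (subvolume, label) tuples.
    """
    depth = len(surface_volume)
    height = len(ink_label)
    width = len(ink_label[0])
    half = patch // 2
    pairs = []
    # Skip the border so every window lies fully inside the image.
    for y in range(half, height - half):
        for x in range(half, width - half):
            subvolume = [
                [row[x - half:x + half + 1]
                 for row in surface_volume[z][y - half:y + half + 1]]
                for z in range(depth)
            ]
            pairs.append((subvolume, int(ink_label[y][x])))
    return pairs


# Toy data: a 2-layer, 5x5 volume and a label with ink in the right columns.
volume = [[[z + y + x for x in range(5)] for y in range(5)] for z in range(2)]
label = [[1 if x >= 3 else 0 for x in range(5)] for y in range(5)]
pairs = make_training_pairs(volume, label, patch=3)
```

Each resulting pair could then be fed to any supervised classifier; the value of the alignment is that the label for a CT neighborhood comes from a photograph in which the ink is actually visible.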
Taxonomy
Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques · Digital Media Forensic Detection
