Decipherment of Historical Manuscript Images

Xusen Yin; Nada Aldarrab; Beáta Megyesi; Kevin Knight

arXiv:1810.04297 · cs.CL · June 4, 2019

TL;DR

This paper presents unsupervised methods for automatically deciphering images of historical manuscripts, with the aim of giving historians access to the contents of enciphered documents from the early modern period.

Contribution

It develops unsupervised models for character segmentation, character-image clustering, and decipherment of cluster sequences, and evaluates both pipelined and joint variants on multiple historical ciphers.

Findings

01 Successful unsupervised decipherment of historical cipher images

02 Effective character-image clustering for early modern manuscripts

03 Demonstrated models outperform baseline approaches

Abstract

European libraries and archives are filled with enciphered manuscripts from the early modern period. These include military and diplomatic correspondence, records of secret societies, private letters, and so on. Although they are enciphered with classical cryptographic algorithms, their contents are unavailable to working historians. We therefore attack the problem of automatically converting cipher manuscript images into plaintext. We develop unsupervised models for character segmentation, character-image clustering, and decipherment of cluster sequences. We experiment with both pipelined and joint models, and we give empirical results for multiple ciphers.
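
To make the last stage concrete, the following is a minimal sketch of "decipherment of cluster sequences" under strong assumptions: the manuscript has already been segmented and clustered into a sequence of integer cluster IDs, the cipher is a simple one-to-one substitution, and the plaintext language is modeled by a character bigram model. It is not the authors' model. The brute-force key search is a stand-in that only works for toy alphabets, and the names `train_bigram_lm`, `score`, and `decipher` are hypothetical.

```python
import itertools
import math
from collections import Counter


def train_bigram_lm(text, alphabet, smoothing=1.0):
    """Return smoothed log P(b | a) for every character pair in `alphabet`."""
    pair_counts = Counter(zip(text, text[1:]))
    first_counts = Counter(text[:-1])
    vocab = len(alphabet)
    return {
        (a, b): math.log(
            (pair_counts[(a, b)] + smoothing)
            / (first_counts[a] + smoothing * vocab)
        )
        for a in alphabet
        for b in alphabet
    }


def score(plaintext, lm):
    """Sum of bigram log-probabilities of a candidate plaintext."""
    return sum(lm[(a, b)] for a, b in zip(plaintext, plaintext[1:]))


def decipher(cluster_ids, alphabet, lm):
    """Try every one-to-one cluster-to-letter key and keep the best-scoring one.

    Assumption: exhaustive search, feasible only for a handful of clusters.
    """
    clusters = sorted(set(cluster_ids))
    best_key, best_score = None, -math.inf
    for letters in itertools.permutations(alphabet, len(clusters)):
        key = dict(zip(clusters, letters))
        candidate = "".join(key[c] for c in cluster_ids)
        candidate_score = score(candidate, lm)
        if candidate_score > best_score:
            best_key, best_score = key, candidate_score
    return best_key, "".join(best_key[c] for c in cluster_ids)


if __name__ == "__main__":
    # Toy language model trained on a tiny, made-up plaintext sample.
    lm = train_bigram_lm("abababab", "abc")
    # Toy "cipher": cluster IDs as they might come out of a clustering step.
    key, plaintext = decipher([2, 0, 2, 0, 2, 0], "abc", lm)
    print(key, plaintext)
```

The search space grows factorially with the number of clusters, so a real cipher alphabet would need a different search strategy. The sketch only shows the shape of the problem: choose the key whose decoding the language model finds most plausible.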

Code & Models

Repositories

yinxusen/decipherment-images (Official)

Taxonomy

Topics: Handwritten Text Recognition Techniques · Digital Media Forensic Detection · Image Processing and 3D Reconstruction