Learning to Decipher from Pixels -- A Case Study of Copiale

Lei Kang; Giuseppe De Gregorio; Raphaela Heil; Alicia Fornés; Beáta Megyesi

arXiv:2604.23683·cs.CV·April 28, 2026

TL;DR

This paper presents an end-to-end, transcription-free method for deciphering historical encrypted manuscripts directly from handwritten cipher images to plaintext, demonstrated on the Copiale cipher.

Contribution

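The paper introduces the first text-line-level dataset pairing Copiale cipher images with German plaintext, and shows that pretraining on generic handwriting data followed by cipher-specific fine-tuning substantially improves decipherment accuracy.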

Findings

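01. Pretraining on generic handwriting data, followed by cipher-specific fine-tuning, substantially improves decipherment accuracy.

02. Transcription-free, image-to-plaintext decipherment is feasible and effective.

03. The approach is a scalable alternative to traditional transcription-first decipherment pipelines.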

Abstract

Historical encrypted manuscripts require both paleographic interpretation of cipher symbols and cryptanalytic recovery of plaintext. Most existing computational workflows rely on a transcription-first paradigm, in which handwritten symbols are transcribed prior to decipherment. This intermediate step is labor-intensive, error-prone, and not always aligned with the goal of direct plaintext recovery. We propose an end-to-end, transcription-free approach that directly maps handwritten cipher images to plaintext. Using the Copiale cipher as a case study, we introduce the first text-line-level dataset pairing cipher images with German plaintext. We show that pretraining on generic handwriting data followed by cipher-specific fine-tuning substantially improves decipherment accuracy. Our results demonstrate that transcription-free image-to-plaintext decipherment is both feasible and effective…
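The excerpt above does not say how "decipherment accuracy" is measured. One common choice for image-to-text tasks is the character error rate (CER): the edit distance between predicted and reference plaintext, divided by the reference length. The sketch below is a minimal illustration under that assumption, not the authors' evaluation code; the function names and the example strings are hypothetical and are not taken from the Copiale dataset.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(predictions: Sequence[str],
                         references: Sequence[str]) -> float:
    """Total edits over total reference characters, across all text lines."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(edit_distance(p, r)
                      for p, r in zip(predictions, references))
    return total_edits / total_chars


if __name__ == "__main__":
    # Invented example line: one wrong character out of nine.
    print(character_error_rate(["geheimnas"], ["geheimnis"]))
```

Under this assumed metric, lower is better, and a score computed on the predicted plaintext reflects end-to-end quality without requiring an intermediate symbol transcription.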

Code & Models

Repositories

leitro/Decipher-from-Pixels-Copiale (GitHub)

Datasets

leitro/Copiale_Lines (dataset, 47 downloads)
