Solving Historical Dictionary Codes with a Neural Language Model

Christopher Chu; Raphael Valenti; Kevin Knight

arXiv:2010.04746 · cs.CL · October 13, 2020

TL;DR

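This paper solves word-based substitution codes by building a decoding lattice and searching it with a neural language model. Applied to enciphered letters between US Army General James Wilkinson and agents of the Spanish Crown from the late 1700s and early 1800s, the method deciphers 75.1% of the cipher-word tokens correctly.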

Contribution

A decoding approach for difficult word-based substitution codes: construct a lattice of candidate decodings, then search that lattice with a neural language model.

Findings

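01 Deciphered 75.1% of the cipher-word tokens correctly

02 Applied the method to enciphered letters exchanged between US Army General James Wilkinson and agents of the Spanish Crown in the late 1700s and early 1800s, obtained from the US Library of Congress

03 Showed that a neural language model can guide the search over a decoding lattice for word-based substitution codes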

Abstract

We solve difficult word-based substitution codes by constructing a decoding lattice and searching that lattice with a neural language model. We apply our method to a set of enciphered letters exchanged between US Army General James Wilkinson and agents of the Spanish Crown in the late 1700s and early 1800s, obtained from the US Library of Congress. We are able to decipher 75.1% of the cipher-word tokens correctly.
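
The lattice-plus-language-model idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how candidate words are generated or which neural model is used, so the example below assumes a hand-written lattice (one list of candidate plaintext words per cipher token) and substitutes a toy bigram table for the neural language model. The names `BIGRAM_LOGP`, `score`, and `beam_search` are hypothetical and chosen only for illustration.

```python
import math

# ASSUMPTION: a toy bigram table stands in for the neural language model.
# The probabilities are invented for illustration only.
BIGRAM_LOGP = {
    ("<s>", "the"): math.log(0.5),
    ("the", "general"): math.log(0.4),
    ("the", "river"): math.log(0.1),
    ("general", "arrived"): math.log(0.5),
    ("river", "arrived"): math.log(0.05),
}
UNSEEN_LOGP = math.log(1e-4)  # fallback score for word pairs not in the table


def score(prev_word, word):
    """Log-probability of `word` following `prev_word` under the toy model."""
    return BIGRAM_LOGP.get((prev_word, word), UNSEEN_LOGP)


def beam_search(lattice, beam_size=2):
    """Return the best-scoring word sequence through the lattice.

    `lattice` is a list of positions; each position holds the candidate
    plaintext words for one cipher token (hypothetical representation).
    """
    beams = [(0.0, ["<s>"])]  # (cumulative log-probability, words so far)
    for candidates in lattice:
        expanded = [
            (logp + score(words[-1], cand), words + [cand])
            for logp, words in beams
            for cand in candidates
        ]
        # Keep only the `beam_size` highest-scoring partial decodings.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:beam_size]
    best_logp, best_words = beams[0]
    return best_words[1:], best_logp  # drop the start symbol


# Three cipher tokens, each with its own list of candidate plaintext words.
lattice = [["the"], ["general", "river", "garden"], ["arrived", "argued"]]
decoded, logp = beam_search(lattice)
print(" ".join(decoded))
```

With these toy scores the search selects "the general arrived". Beam search is used here only as a simple way to keep the example short; the abstract does not specify which search procedure the authors use.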

Taxonomy

Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques · Authorship Attribution and Profiling