Decoding Ancient Oracle Bone Script via Generative Dictionary Retrieval

Yin Wu; Gangjian Zhang; Jiayu Chen; Chang Xu; Yuyu Luo; Nan Tang; Hui Xiong

arXiv:2604.09668 · cs.IR · April 14, 2026

TL;DR

This paper recasts the decipherment of Oracle Bone Script as dictionary retrieval over deep-learning-generated character variants, reaching 54.3% Top-10 and 86.6% Top-50 accuracy on unseen characters, where existing computational methods reportedly achieve under 3%.

Contribution

It introduces a dictionary-based retrieval framework, guided by character evolution principles, for deciphering characters of ancient scripts that have not yet been decoded.

Findings

01. Achieves 54.3% Top-10 accuracy on unseen characters

02. Achieves 86.6% Top-50 accuracy on unseen characters

03. Provides a scalable, interpretable method for archaeological decipherment
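
For readers unfamiliar with the metric, Top-k accuracy counts a query as correct when the true character appears anywhere among the first k retrieved candidates. The sketch below shows that computation on toy data. The function name `top_k_accuracy` and the example rankings are hypothetical and are not taken from the paper.

```python
def top_k_accuracy(ranked_candidates, gold_labels, k):
    """Fraction of queries whose gold label is among the first k ranked candidates."""
    if len(ranked_candidates) != len(gold_labels):
        raise ValueError("one ranked candidate list per gold label is required")
    if not gold_labels:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(ranked_candidates, gold_labels)
        if gold in ranked[:k]
    )
    return hits / len(gold_labels)


# Toy rankings for three queries (illustrative only, not data from the paper).
ranked = [
    ["日", "月", "目"],
    ["木", "水", "火"],
    ["人", "入", "八"],
]
gold = ["月", "火", "大"]

top1 = top_k_accuracy(ranked, gold, 1)
top3 = top_k_accuracy(ranked, gold, 3)
```

With these toy rankings no query is correct at rank 1, while two of the three are correct within the first three candidates, which is why Top-k figures rise as k grows.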

Abstract

Understanding humanity's earliest writing systems is crucial for reconstructing civilization's origins, yet many ancient scripts remain undeciphered. Oracle Bone Script (OBS) from China's Shang dynasty exemplifies this challenge: only approximately 1,500 of roughly 4,600 characters have been decoded, and a substantial portion of these 3,000-year-old inscriptions remains only partially understood. Limited by extreme data scarcity, existing computational methods achieve under 3% accuracy on unseen characters -- the core palaeographic challenge. We overcome this by reframing decipherment from classification to dictionary-based retrieval. Using deep learning guided by character evolution principles, we generate a comprehensive synthetic dictionary of plausible OBS variants for modern Chinese characters. Scholars query unknown inscriptions to retrieve visually similar candidates with…
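
The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes that each synthetic dictionary variant and each query inscription has already been mapped to a feature vector, and that candidates are ranked by cosine similarity. Both are assumptions made for illustration: the abstract does not state the paper's feature extractor or similarity measure, and the names `cosine`, `retrieve`, and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_vec, dictionary, k):
    """Return the top-k (modern_character, score) pairs for a query.

    `dictionary` is a list of (modern_character, variant_vector) entries.
    A modern character may have several synthetic variants; it is scored
    by its best-matching variant so that it appears once in the ranking.
    """
    best = {}
    for char, vec in dictionary:
        score = cosine(query_vec, vec)
        if char not in best or score > best[char]:
            best[char] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Toy dictionary with made-up 3-dimensional features (assumption, not paper data).
dictionary = [
    ("日", [1.0, 0.0, 0.1]),
    ("日", [0.9, 0.1, 0.0]),
    ("月", [0.1, 1.0, 0.0]),
    ("木", [0.0, 0.1, 1.0]),
]
query = [0.8, 0.2, 0.0]

results = retrieve(query, dictionary, k=2)
candidates = [char for char, _ in results]
```

Scoring each modern character by its best variant is one possible design choice. It keeps the candidate list free of repeated characters, so a scholar reviewing the top k sees k distinct hypotheses.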
