Deciphering Oracle Bone Language with Diffusion Models
Haisu Guan, Huanxin Yang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu

TL;DR
This paper presents an approach that uses conditional diffusion models to generate decipherment clues for ancient Oracle Bone Script, sidestepping the large text corpora that traditional NLP methods depend on.
Contribution
Introduces Oracle Bone Script Decipher (OBSD), a diffusion-based method for deciphering ancient scripts, marking a new application of AI in historical linguistics.
Findings
Effective in generating decipherment clues
Outperforms traditional NLP methods
Validated on Oracle Bone Script dataset
Abstract
Originating from China's Shang Dynasty approximately 3,000 years ago, the Oracle Bone Script (OBS) is a cornerstone in the annals of linguistic history, predating many established writing systems. Despite the discovery of thousands of inscriptions, a vast expanse of OBS remains undeciphered, casting a veil of mystery over this ancient language. The emergence of modern AI technologies presents a novel frontier for OBS decipherment, challenging traditional NLP methods that rely heavily on large textual corpora, a luxury not afforded by historical languages. This paper introduces a novel approach by adopting image generation techniques, specifically through the development of Oracle Bone Script Decipher (OBSD). Utilizing a conditional diffusion-based strategy, OBSD generates vital clues for decipherment, charting a new course for AI-assisted analysis of ancient languages. To validate its…
Taxonomy
Topics: Natural Language Processing Techniques · Data Mining Algorithms and Applications · Semantic Web and Ontologies
