Diff-Oracle: Deciphering Oracle Bone Scripts with Controllable Diffusion Model
Jing Li, Qiu-Feng Wang, Siyuan Wang, Rui Zhang, Kaizhu Huang, and Erik Cambria

TL;DR
Diff-Oracle introduces a diffusion-based model with style and content control for generating oracle bone script images, which aids deciphering and improves recognition accuracy.
Contribution
The paper presents Diff-Oracle, a novel diffusion model that incorporates style and content encoders to generate diverse and controllable oracle characters, advancing the field of oracle script deciphering.
Findings
Outperforms existing generative methods in image quality.
Achieves a 7.70% accuracy improvement in zero-shot recognition.
Sets a new benchmark with 84.62% accuracy on the OBC306 dataset.
Abstract
Deciphering oracle bone scripts plays an important role in Chinese archaeology and philology. However, a significant challenge remains due to the scarcity of oracle character images. To overcome this issue, we propose Diff-Oracle, a novel approach based on diffusion models to generate a diverse range of controllable oracle characters. Unlike traditional diffusion models that operate primarily on text prompts, Diff-Oracle incorporates a style encoder that utilizes style reference images to control the generation style. This encoder extracts style prompts from existing oracle character images, where style details are converted into a text embedding format via a pretrained language-vision model. On the other hand, a content encoder is integrated within Diff-Oracle to capture specific content details from content reference images, ensuring that the generated characters accurately represent…
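The two conditioning paths described in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation: the random weight matrices stand in for the pretrained encoders, and every name and size (`TEXT_DIM`, `N_STYLE_TOKENS`, `style_encoder`, `content_encoder`, `condition_latent`) is hypothetical. Feeding the style tokens through cross-attention and adding the content features to the latent are both assumptions made for illustration, since the abstract does not say how either signal is injected.

```python
import numpy as np

TEXT_DIM = 16        # hypothetical width of the text-embedding space
N_STYLE_TOKENS = 4   # hypothetical number of style-prompt tokens


def style_encoder(style_img, w_img, w_proj):
    """Map a style reference image to 'style prompt' tokens in text-embedding format."""
    feat = np.tanh(style_img.reshape(-1) @ w_img)  # stand-in for a pretrained image encoder
    return (feat @ w_proj).reshape(N_STYLE_TOKENS, TEXT_DIM)


def content_encoder(content_img, w_chan):
    """Turn a content reference image into a spatial feature map (toy stand-in)."""
    h, w = content_img.shape
    pooled = content_img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))  # 2x2 average pooling
    return pooled[..., None] * w_chan  # shape (h/2, w/2, TEXT_DIM)


def attention_weights(queries, tokens):
    """Softmax attention of each query position over the style tokens."""
    scores = queries @ tokens.T / np.sqrt(tokens.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=1, keepdims=True)


def condition_latent(noisy_latent, style_tokens, content_feat):
    """Inject content additively and style via cross-attention (both are assumptions)."""
    hidden = noisy_latent + content_feat
    queries = hidden.reshape(-1, hidden.shape[-1])
    weights = attention_weights(queries, style_tokens)
    return (queries + weights @ style_tokens).reshape(hidden.shape)


rng = np.random.default_rng(0)
style_img = rng.random((8, 8))      # placeholder style reference image
content_img = rng.random((8, 8))    # placeholder content reference image
w_img = rng.normal(size=(64, 32))
w_proj = rng.normal(size=(32, N_STYLE_TOKENS * TEXT_DIM))
w_chan = rng.normal(size=TEXT_DIM)

style_tokens = style_encoder(style_img, w_img, w_proj)
content_feat = content_encoder(content_img, w_chan)
noisy_latent = rng.normal(size=content_feat.shape)
conditioned = condition_latent(noisy_latent, style_tokens, content_feat)
print(style_tokens.shape, content_feat.shape, conditioned.shape)
```

The sketch only shows how the shapes fit together: the style path produces a short token sequence that can sit where a text prompt normally would, while the content path keeps a spatial layout aligned with the latent.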
Taxonomy
Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion · Contrastive Language-Image Pre-training
