TL;DR
This paper introduces a few-shot learning method for recognizing handwritten ciphered manuscripts, effectively handling unseen alphabets and limited labeled data through synthetic training and fine-tuning.
Contribution
It presents a novel few-shot object detection approach for cipher recognition, capable of recognizing new alphabets with minimal labeled data and surpassing existing methods.
Findings
Recognizes unseen cipher alphabets using synthetic training.
Outperforms existing handwritten text recognition (HTR) methods when fine-tuned on a few labeled pages.
Effective in handling touching symbols and variable cipher alphabets.
Abstract
Encoded (or ciphered) manuscripts are a special type of historical document that contains encrypted text. The automatic recognition of this kind of document is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpora for training, and 3) touching symbols make symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten cipher recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if few labeled pages with the same alphabet are used for fine-tuning,…
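The abstract says a decoding step maps symbol similarity scores to the final transcription, but the excerpt above does not give the procedure. The following is only an illustrative sketch of one plausible way to do this, not the authors' algorithm. It assumes each alphabet symbol has one similarity score per horizontal position of the line image, and the names `decode_line`, `threshold`, and `min_gap` are hypothetical. It keeps local score peaks above a threshold, suppresses weaker detections that overlap a stronger one, and reads the survivors left to right.

```python
from typing import Dict, List, Tuple


def decode_line(
    scores: Dict[str, List[float]],
    threshold: float = 0.5,  # hypothetical minimum similarity for a detection
    min_gap: int = 3,        # hypothetical minimum distance (in columns) between symbols
) -> List[str]:
    """Turn per-symbol similarity scores into a left-to-right symbol sequence."""
    candidates: List[Tuple[float, int, str]] = []  # (score, column, symbol)

    # Step 1: collect local maxima above the threshold for every symbol.
    for symbol, row in scores.items():
        for x, s in enumerate(row):
            if s < threshold:
                continue
            left = row[x - 1] if x > 0 else float("-inf")
            right = row[x + 1] if x + 1 < len(row) else float("-inf")
            if s >= left and s > right:
                candidates.append((s, x, symbol))

    # Step 2: greedy non-maximum suppression along the horizontal axis.
    # Strongest detections are kept first; anything closer than min_gap
    # columns to an already kept detection is discarded.
    candidates.sort(key=lambda c: c[0], reverse=True)
    kept: List[Tuple[float, int, str]] = []
    for s, x, symbol in candidates:
        if all(abs(x - kx) >= min_gap for _, kx, _ in kept):
            kept.append((s, x, symbol))

    # Step 3: read the surviving detections in reading order.
    kept.sort(key=lambda c: c[1])
    return [symbol for _, _, symbol in kept]


# Toy example with a two-symbol alphabet and a line that is 9 columns wide.
example_scores = {
    "A": [0.1, 0.9, 0.2, 0.1, 0.1, 0.1, 0.1, 0.8, 0.1],
    "B": [0.1, 0.6, 0.1, 0.1, 0.95, 0.1, 0.1, 0.1, 0.1],
}
print(decode_line(example_scores))
```

In the toy example, the weaker "B" response at column 1 is suppressed by the stronger "A" at the same position. This is the kind of ambiguity that arises when similar-looking or touching symbols compete for the same region of the line.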
