A prototype system for handwritten sub-word recognition: Toward Arabic-manuscript transliteration
Reza Farrahi Moghaddam, Mohamed Cheriet, Thomas Milo, Robert Wisnovsky

TL;DR
This paper presents a prototype system for recognizing and transliterating sub-words in handwritten Arabic manuscripts, including dot-less scripts, using skeleton features and binary classifiers, with promising initial results.
Contribution
It introduces a novel approach combining skeleton features and binary classifiers for sub-word recognition in Arabic manuscripts, including archigraphemic scripts.
Findings
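The prototype reads sub-words from skeleton-based features; the authors describe the initial results as promising.
Recasting the highly multiclass sub-word recognition problem as a set of binary descriptor classifiers, learned with SVMs, reduces its complexity.
The system is adaptable to various Arabic manuscript styles.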
Abstract
A prototype system for the transliteration of diacritics-less Arabic manuscripts at the sub-word or part of Arabic word (PAW) level is developed. The system is able to read sub-words of the input manuscript using a set of skeleton-based features. A variation of the system is also developed which reads archigraphemic Arabic manuscripts, which are dot-less, into archigraphemes transliteration. In order to reduce the complexity of the original highly multiclass problem of sub-word recognition, it is redefined into a set of binary descriptor classifiers. The outputs of trained binary classifiers are combined to generate the sequence of sub-word letters. SVMs are used to learn the binary classifiers. Two specific Arabic databases have been developed to train and test the system. One of them is a database of the Naskh style. The initial results are promising. The systems could be trained on…
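The abstract's central idea, replacing one highly multiclass decision with many binary ones whose outputs are combined into a letter sequence, can be illustrated with a minimal sketch. The abstract does not say which binary descriptors the authors use, so the (position, letter) descriptors below are a hypothetical choice made only for illustration, as are the names `ALPHABET`, `MAX_LEN`, `encode`, and `decode`. The Latin letters stand in for Arabic letter classes, and the signed scores stand in for the outputs of trained SVMs; no classifier is trained here.

```python
from itertools import product

# Hypothetical stand-ins: four Latin placeholders for letter classes and a
# maximum sub-word length. Neither value comes from the paper.
ALPHABET = ["b", "s", "l", "m"]
MAX_LEN = 3

# Assumed descriptor set: one binary question per (position, letter) pair,
# e.g. "is the letter at position 1 an 's'?".
DESCRIPTORS = list(product(range(MAX_LEN), ALPHABET))


def encode(subword):
    """Turn a sub-word label into one 0/1 training target per descriptor."""
    if len(subword) > MAX_LEN:
        raise ValueError("sub-word longer than MAX_LEN")
    return [
        1 if pos < len(subword) and subword[pos] == letter else 0
        for pos, letter in DESCRIPTORS
    ]


def decode(scores):
    """Combine per-descriptor scores into a letter sequence.

    `scores` holds one signed value per descriptor (positive means the
    descriptor fires), as a binary classifier's decision value would.
    At each position the strongest positive descriptor wins; the sequence
    ends at the first position where no descriptor fires.
    """
    letters = []
    for pos in range(MAX_LEN):
        best_letter, best_score = None, 0.0
        for (p, letter), score in zip(DESCRIPTORS, scores):
            if p == pos and score > best_score:
                best_letter, best_score = letter, score
        if best_letter is None:
            break
        letters.append(best_letter)
    return letters


# Round trip: map 0/1 targets to -1/+1 scores, then decode them.
targets = encode("bsl")
scores = [2 * t - 1 for t in targets]
print(decode(scores))
```

With these assumed settings there are 12 binary descriptors, whereas treating every possible sub-word of length 1 to 3 as its own class would give 4 + 16 + 64 = 84 classes. The gap widens quickly as the alphabet and the maximum length grow, which is the kind of complexity reduction the abstract refers to.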
