Word level Script Identification from Bangla and Devanagri Handwritten Texts mixed with Roman Script
Ram Sarkar, Nibaran Das, Subhadip Basu, Mahantapas Kundu, Mita Nasipuri, Dipak Kumar Basu

TL;DR
This paper introduces a system for automatically identifying and separating handwritten Bangla, Devanagri, and Roman scripts in mixed-language documents using feature-based classification, achieving over 98% accuracy.
Contribution
It presents a novel script separation method combining script-independent text line extraction with a multi-layer perceptron classifier trained on holistic features.
Findings
Achieved 99.29% accuracy for Bangla-Roman script separation.
Achieved 98.43% accuracy for Devanagri-Roman script separation.
Evaluated on two equal-sized datasets, one mixing Bangla with Roman script and the other mixing Devanagri with Roman script.
Abstract
India is a multilingual country where Roman script is often used alongside different Indic scripts in a text document. To develop a script-specific handwritten Optical Character Recognition (OCR) system, it is therefore necessary to identify the scripts of handwritten text correctly. In this paper, we present a system that automatically separates the scripts of handwritten words in a document written in Bangla or Devanagri mixed with Roman script. In this script separation technique, we first extract the text lines and words from document pages using a script-independent Neighboring Component Analysis technique. We then design a Multi Layer Perceptron (MLP) based classifier for script separation, trained with 8 different word-level holistic features. Two equal-sized datasets, one with Bangla and Roman scripts and the other with Devanagri and Roman scripts, are prepared for…
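To make the classification stage concrete, here is a minimal sketch of a one-hidden-layer perceptron that separates two classes of 8-dimensional word feature vectors. It is not the authors' implementation. The abstract does not list the eight holistic features, so the vectors below are synthetic placeholders, and the hidden-layer size, learning rate, epoch count, and all names (`TinyMLP`, `make_toy_words`) are assumptions chosen for illustration only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer of sigmoid units and one sigmoid output, trained by SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        # Cross-entropy loss with a sigmoid output gives dL/dz_out = y - target.
        d_out = y - target
        for j, hj in enumerate(h):
            # Back-propagate through hidden unit j before its output weight changes.
            d_hid = d_out * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= lr * d_out * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d_hid * xi
            self.b1[j] -= lr * d_hid
        self.b2 -= lr * d_out

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


def make_toy_words(n_per_class, rng):
    """Synthetic stand-ins for 8 word-level features (NOT the paper's features).

    Label 0 plays the role of the Indic script, label 1 the Roman script.
    """
    data = []
    for label, centre in ((0, 0.3), (1, 0.7)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(centre, 0.1) for _ in range(8)], label))
    rng.shuffle(data)
    return data


rng = random.Random(0)
train = make_toy_words(100, rng)
test = make_toy_words(50, rng)

model = TinyMLP(n_in=8, n_hidden=6, rng=rng)  # hidden size is an assumption
for epoch in range(50):
    for x, t in train:
        model.train_step(x, t, lr=0.5)

accuracy = sum(model.predict(x) == t for x, t in test) / len(test)
print(f"toy accuracy: {accuracy:.2f}")
```

The toy clusters are deliberately well separated, so the accuracy printed here says nothing about the reported results; the sketch only shows the shape of the input (one 8-value vector per word) and the two-class decision the classifier makes.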
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction
