Recognition of Handwritten Textual Annotations using Tesseract Open Source OCR Engine for information Just In Time (iJIT)
Sandip Rakshit, Subhadip Basu, Hisashi Ikeda

TL;DR
This paper presents a method for recognizing handwritten lower-case Roman script annotations with the Tesseract OCR engine. It builds user-specific models, each trained on one user's handwriting samples, and reports recognition accuracy above 80% for five users.
Contribution
It develops a user-specific handwritten text recognition system with Tesseract, trained on individual samples, for improved accuracy in information Just In Time applications.
Findings
Recognition accuracy ranged from 81.53% to 92.88%.
System successfully segmented and recognized handwritten characters.
User-specific models improved recognition performance.
Abstract
The objective of the current work is to develop an Optical Character Recognition (OCR) engine for the information Just In Time (iJIT) system that can be used to recognize handwritten textual annotations in lower case Roman script. The Tesseract open source OCR engine, released under Apache License 2.0, is used to develop user-specific handwriting recognition models, viz., the language sets, for the said system, where each user is identified by a unique identification tag associated with the digital pen. To generate the language set for any user, Tesseract is trained with labeled handwritten data samples of isolated and free-flow texts of Roman script, collected exclusively from that user. The designed system is tested on five different language sets with free-flow handwritten annotations as test samples. The system could successfully segment and subsequently recognize 87.92%, 81.53%, 92.88%, 86.75% and…
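As a rough illustration of the per-user design described in the abstract, the sketch below maps a pen's identification tag to the name of that user's trained language set and builds the corresponding `tesseract` command line. The pen tags, language-set names, and the helper `build_tesseract_command` are hypothetical and do not come from the paper; only the `tesseract <image> <outputbase> -l <lang>` invocation form is standard Tesseract usage.

```python
# Hypothetical registry: pen identification tag -> name of the Tesseract
# language set trained on that user's handwriting. Tags and names are
# illustrative assumptions, not values from the paper.
PEN_TO_LANGUAGE_SET = {
    "pen-001": "user1",
    "pen-002": "user2",
}


def build_tesseract_command(pen_id, image_path, output_base):
    """Return the tesseract CLI arguments for the writer who owns pen_id."""
    try:
        language_set = PEN_TO_LANGUAGE_SET[pen_id]
    except KeyError:
        raise ValueError(f"no language set trained for pen {pen_id!r}")
    # `-l <lang>` tells Tesseract which trained language set to load.
    return ["tesseract", image_path, output_base, "-l", language_set]
```

Passing the returned list to `subprocess.run` would perform the recognition, assuming Tesseract is installed and the named language set is available in its data directory.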
Taxonomy
Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Processing and 3D Reconstruction
