Scene Text Recognition with Full Normalization
Nathan Zachary, Gerald Carl, Russell Elijah, Hessi Roma, Robert Leer, James Amelia

TL;DR
This paper introduces a new dataset of smartphone-captured scene text images and demonstrates the effectiveness of profile normalization and augmentation techniques in improving scene text recognition performance.
Contribution
The paper presents a new publicly available dataset and highlights the benefits of profile normalization and data augmentation for scene text recognition on smartphones.
Findings
Profile normalization improves recognition accuracy.
Augmentation techniques enhance model robustness.
New dataset supports future research in mobile OCR.
Abstract
Scene text recognition has made significant progress in recent years and has become an important part of many workflows. The widespread use of mobile devices opens up broad possibilities for using OCR technologies in everyday life. However, the lack of training data for new research in this area remains a problem. In this article, we present a new dataset of real photographs taken with smartphones and demonstrate the effectiveness of profile normalization for this task. In addition, we study in detail the influence of various augmentations on the training of models for analyzing document images captured with smartphones. Our dataset is publicly available.
Taxonomy
Topics: Handwritten Text Recognition Techniques · Video Analysis and Summarization · Image Processing and 3D Reconstruction
