Scene Text Recognition with Full Normalization

Nathan Zachary; Gerald Carl; Russell Elijah; Hessi Roma; Robert Leer; James Amelia

arXiv:2109.01034·cs.CV·September 3, 2021

TL;DR

This paper introduces a new dataset of smartphone-captured scene text images and demonstrates the effectiveness of profile normalization and augmentation techniques in improving scene text recognition performance.

Contribution

The paper presents a new publicly available dataset and highlights the benefits of profile normalization and data augmentation for scene text recognition on smartphones.

Findings

01. Profile normalization improves recognition accuracy.

02. Augmentation techniques enhance model robustness.

03. New dataset supports future research in mobile OCR.

Abstract

Scene text recognition has made significant progress in recent years and has become an important part of many workflows. The widespread use of mobile devices opens up broad possibilities for using OCR technologies in everyday life. However, the lack of training data for new research in this area remains a problem. In this article, we present a new dataset consisting of real photographs taken with smartphones and demonstrate the effectiveness of profile normalization in this task. In addition, we study in detail the influence of various augmentations during the training of models for analyzing document images on smartphones. Our dataset is publicly available.

Taxonomy

Topics: Handwritten Text Recognition Techniques · Video Analysis and Summarization · Image Processing and 3D Reconstruction