# Efficient, Lexicon-Free OCR using Deep Learning

**Authors:** Marcin Namysl, Iuliu Konya

arXiv: 1906.01969 · 2019-06-06

## TL;DR

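This paper introduces a segmentation-free OCR system for text in unconstrained environments such as natural scenes. It combines deep learning models with synthetic training data rendered from large text corpora and over 2000 fonts, and with augmentation by geometric distortions and alpha-compositing with background textures.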

## Contribution

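The paper presents a segmentation-free OCR approach built on a convolutional neural network encoder, and examines both recurrent and convolutional networks for modeling the interactions between input elements. Training relies on synthetic data generation and a proposed augmentation technique, alpha-compositing with background textures.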

## Key findings

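- The system recognizes text effectively in natural scenes, despite geometric distortions, complex backgrounds, and diverse fonts.
- Synthetic training data, combined with geometric distortions and alpha-compositing with background textures, improves model robustness.
- The deep learning models are reported to outperform traditional OCR methods.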

## Abstract

Contrary to popular belief, Optical Character Recognition (OCR) remains a challenging problem when text occurs in unconstrained environments, like natural scenes, due to geometrical distortions, complex backgrounds, and diverse fonts. In this paper, we present a segmentation-free OCR system that combines deep learning methods, synthetic training data generation, and data augmentation techniques. We render synthetic training data using large text corpora and over 2000 fonts. To simulate text occurring in complex natural scenes, we augment extracted samples with geometric distortions and with a proposed data augmentation technique - alpha-compositing with background textures. Our models employ a convolutional neural network encoder to extract features from text images. Inspired by the recent progress in neural machine translation and language modeling, we examine the capabilities of both recurrent and convolutional neural networks in modeling the interactions between input elements.
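The alpha-compositing augmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes grayscale images stored as nested lists of floats in [0, 1], treats the rendered text as a per-pixel coverage mask, and uses the standard blend `out = alpha * ink + (1 - alpha) * background`. The function name `alpha_composite` and the `ink` and `opacity` parameters are hypothetical names chosen for illustration.

```python
from typing import List

# Grayscale image as rows of floats in [0.0, 1.0] (0 = black, 1 = white).
Image = List[List[float]]


def alpha_composite(text_mask: Image, background: Image,
                    ink: float = 0.0, opacity: float = 1.0) -> Image:
    """Blend text onto a background texture (illustrative sketch only).

    text_mask:  per-pixel glyph coverage, 0.0 = no text, 1.0 = full text.
    background: texture of the same shape as text_mask.
    ink:        gray level of the text itself (assumed parameter).
    opacity:    global scaling of the mask, to simulate faint text (assumed).
    """
    same_height = len(text_mask) == len(background)
    same_widths = all(len(m) == len(b) for m, b in zip(text_mask, background))
    if not (same_height and same_widths):
        raise ValueError("text_mask and background must have the same shape")

    out: Image = []
    for mask_row, bg_row in zip(text_mask, background):
        row = []
        for coverage, bg in zip(mask_row, bg_row):
            alpha = opacity * coverage
            # Standard "over" blend of ink on top of the background pixel.
            row.append(alpha * ink + (1.0 - alpha) * bg)
        out.append(row)
    return out


if __name__ == "__main__":
    mask = [[0.0, 1.0], [0.5, 0.0]]     # tiny 2x2 "glyph"
    texture = [[0.8, 0.8], [0.4, 0.2]]  # tiny 2x2 background texture
    print(alpha_composite(mask, texture, ink=0.0))
```

Where the mask is 0 the background texture shows through unchanged, and where it is 1 the pixel takes the ink value, so anti-aliased glyph edges blend smoothly into the texture. A real pipeline would more likely operate on arrays and sample textures and ink levels at random per training example.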

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.01969/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1906.01969/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1906.01969/full.md

---
Source: https://tomesphere.com/paper/1906.01969