UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World
Shangbang Long, Cong Yao

TL;DR
UnrealText introduces a 3D graphics engine-based method for synthesizing realistic scene text images, improving training data quality for text detection and recognition, including multilingual and detailed annotations.
Contribution
The paper presents UnrealText, a novel synthetic image generation approach using 3D rendering for more realistic and informative scene text datasets, enhancing training effectiveness.
Findings
Effective for scene text detection and recognition
Generates multilingual scene text datasets
Improves training data realism and annotation detail
Abstract
Synthetic data has been a critical tool for training scene text detection and recognition models. On the one hand, synthetic word images have proven to be a successful substitute for real images in training scene text recognizers. On the other hand, however, scene text detectors still heavily rely on a large amount of manually annotated real-world images, which are expensive. In this paper, we introduce UnrealText, an efficient image synthesis method that renders realistic images via a 3D graphics engine. 3D synthetic engine provides realistic appearance by rendering scene and text as a whole, and allows for better text region proposals with access to precise scene information, e.g. normal and even object meshes. The comprehensive experiments verify its effectiveness on both scene text detection and recognition. We also generate a multilingual version for future research into…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
UnrealText: Synthesizing Realistic Scene Text Images From the Unreal World· youtube
Taxonomy
TopicsHandwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Video Analysis and Summarization
