SynthText3D: Synthesizing Scene Text Images from 3D Virtual Worlds
Minghui Liao, Boyu Song, Shangbang Long, Minghang He, Cong Yao, Xiang Bai

TL;DR
SynthText3D introduces a novel method for generating realistic scene text images from 3D virtual worlds, enabling diverse and annotated training data for improved scene text detection models.
Contribution
The paper presents a new approach to synthesize scene text images by rendering entire 3D scenes with text, capturing real-world variations like perspective, illumination, and occlusion.
Findings
Synthetic data improves scene text detection performance.
Method outperforms previous 2D-based synthesis techniques.
Generated images exhibit realistic variations and occlusions.
Abstract
With the development of deep neural networks, the demand for a significant amount of annotated training data becomes the performance bottlenecks in many fields of research and applications. Image synthesis can generate annotated images automatically and freely, which gains increasing attention recently. In this paper, we propose to synthesize scene text images from the 3D virtual worlds, where the precise descriptions of scenes, editable illumination/visibility, and realistic physics are provided. Different from the previous methods which paste the rendered text on static 2D images, our method can render the 3D virtual scene and text instances as an entirety. In this way, real-world variations, including complex perspective transformations, various illuminations, and occlusions, can be realized in our synthesized scene text images. Moreover, the same text instances with various…
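To make the perspective point concrete, here is a minimal sketch (not the authors' pipeline) of why placing text on a surface in a 3D scene yields both a perspective-distorted appearance and a free annotation. It assumes an idealized pinhole camera and a flat text rectangle rotated about the vertical axis. All function names and the parameters `focal`, `cx`, `cy` are hypothetical and chosen only for illustration.

```python
import math


def project_point(point, focal, cx, cy):
    """Pinhole projection of a camera-space point (x, y, z) to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (focal * x / z + cx, focal * y / z + cy)


def text_plane_corners(center, width, height, yaw_deg):
    """3D corners of a flat text rectangle rotated by yaw_deg about the vertical axis.

    Order: top-left, top-right, bottom-right, bottom-left
    (image convention: y grows downward).
    """
    yaw = math.radians(yaw_deg)
    px, py, pz = center
    corners = []
    for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        lx = sx * width / 2.0
        ly = sy * height / 2.0
        # Rotating about the vertical axis moves the local x offset into x and z.
        corners.append((px + lx * math.cos(yaw), py + ly, pz + lx * math.sin(yaw)))
    return corners


def annotate_text(center, width, height, yaw_deg, focal=500.0, cx=320.0, cy=240.0):
    """Project the text rectangle into the image; the result is its quadrilateral label."""
    corners = text_plane_corners(center, width, height, yaw_deg)
    return [project_point(p, focal, cx, cy) for p in corners]


if __name__ == "__main__":
    print(annotate_text((0.0, 0.0, 5.0), 2.0, 1.0, yaw_deg=0.0))
    print(annotate_text((0.0, 0.0, 5.0), 2.0, 1.0, yaw_deg=60.0))
```

With `yaw_deg=0` the projected quadrilateral is an axis-aligned rectangle. With a non-zero yaw, the edge nearer the camera projects taller than the far edge, which is the kind of perspective variation that pasting flat text onto a 2D image has to approximate. A real renderer would also handle lighting and occlusion, which this sketch omits.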
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Advanced Image and Video Retrieval Techniques
