Pushing the Performance Limit of Scene Text Recognizer without Human Annotation

Caiyuan Zheng; Hui Li; Seon-Min Rhee; Seungju Han; Jae-Joon Han; Peng Wang

arXiv:2204.07714·cs.CV·May 24, 2022

TL;DR

This paper introduces a semi-supervised framework using consistency regularization to enhance scene text recognition models by effectively leveraging synthetic and unlabeled real images, achieving state-of-the-art results without human annotation.

Contribution

It presents the first consistency regularization framework for scene text recognition, improving model performance by utilizing unlabeled data and addressing domain gap issues.

Findings

01 Significant performance improvements on standard benchmarks.

02 Achieved new state-of-the-art results in scene text recognition.

03 Effectively mitigated domain gap between synthetic and real images.

Abstract

Scene text recognition (STR) has attracted much attention over the years because of its wide application. Most methods train STR models in a fully supervised manner, which requires large amounts of labeled data. Although synthetic data contributes a lot to STR, it suffers from the real-to-synthetic domain gap that restricts model performance. In this work, we aim to boost STR models by leveraging both synthetic data and the numerous real unlabeled images, eliminating human annotation cost entirely. A robust consistency regularization based semi-supervised framework is proposed for STR, which can effectively solve the instability issue due to domain inconsistency between synthetic and real images. A character-level consistency regularization is designed to mitigate the misalignment between characters in sequence recognition. Extensive experiments on standard text recognition benchmarks…
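To make the idea of character-level consistency regularization concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract only states that consistency is enforced per character, so the use of KL divergence, the confidence threshold, and the names `softmax`, `kl_divergence`, and `char_consistency_loss` are all assumptions made for illustration. The sketch compares the per-character predictions for a weakly augmented and a strongly augmented view of the same unlabeled image, position by position.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one character position."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def char_consistency_loss(weak_logits, strong_logits, threshold=0.5):
    """Hypothetical character-level consistency loss for one image.

    weak_logits / strong_logits: one list of class logits per character
    position, for the weakly and strongly augmented views respectively.
    The weak-view prediction serves as the fixed target. Positions whose
    target confidence falls below `threshold` (an assumed hyperparameter)
    are skipped.
    """
    if len(weak_logits) != len(strong_logits):
        raise ValueError("both views must have the same number of positions")
    total, kept = 0.0, 0
    for w, s in zip(weak_logits, strong_logits):
        target = softmax(w)
        if max(target) < threshold:
            continue  # ignore unreliable targets
        total += kl_divergence(target, softmax(s))
        kept += 1
    return total / kept if kept else 0.0

if __name__ == "__main__":
    weak = [[4.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
    strong = [[0.0, 4.0, 0.0], [0.0, 3.0, 0.0]]
    print(char_consistency_loss(weak, strong))
```

Comparing distributions position by position, rather than comparing whole decoded strings, is one plausible way to keep a single misrecognized character from penalizing every other character in the sequence.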

Code & Models

Repositories

Caiyuan-Zheng/Consistency_Regularization_STR
pytorch

Taxonomy

Topics: Handwritten Text Recognition Techniques · Text and Document Classification Technologies · Image Retrieval and Classification Techniques