Text Recognition in Real Scenarios with a Few Labeled Samples
Jinghuang Lin, Zhanzhan Cheng, Fan Bai, Yi Niu, Shiliang Pu, Shuigeng Zhou

TL;DR
This paper introduces a few-shot adversarial sequence domain adaptation method for scene text recognition, enabling high accuracy with limited labeled target samples by aligning synthetic and real data at the character level.
Contribution
The proposed FASDA approach adapts models trained on synthetic data to real-world scenarios using only a few labeled target samples, outperforming finetuning and performing comparably to state-of-the-art methods.
Findings
Significantly outperforms finetuning schemes.
Achieves comparable performance to state-of-the-art STR methods.
Effective in scenarios with scarce labeled data.
Abstract
Scene text recognition (STR) is still a hot research topic in the computer vision field due to its various applications. Existing works mainly focus on learning a general model from a huge number of synthetic text images to recognize unconstrained scene text, and have achieved substantial progress. However, these methods are not well suited to many real-world scenarios where 1) high recognition accuracy is required, while 2) labeled samples are lacking. To tackle this challenging problem, this paper proposes a few-shot adversarial sequence domain adaptation (FASDA) approach to build sequence adaptation between the synthetic source domain (with many synthetic labeled samples) and a specific target domain (with only some or a few real labeled samples). This is done by simultaneously learning each character's feature representation with an attention mechanism and establishing the…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Image Processing and 3D Reconstruction
