ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition
Qi Song, Qianyi Jiang, Nan Li, Rui Zhang, Xiaolin Wei

TL;DR
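ReADS is a scene text recognition network that supervises two branches, one with Connectionist Temporal Classification (CTC) and one with attentional sequence recognition (Attn), and adds spatial and channel attention mechanisms plus rectification to improve accuracy on irregular text. It is trained end-to-end with only word-level annotations.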
Contribution
It introduces a double supervised network in which CTC and Attn are applied in separate branches with different modules so that the two complement each other, together with spatial and channel attention mechanisms and rectification.
Findings
Achieves state-of-the-art results on multiple benchmarks.
Effectively handles irregular and complex scene text.
Outperforms existing methods in accuracy and robustness.
Abstract
In recent years, scene text recognition has generally been regarded as a sequence-to-sequence problem. Connectionist Temporal Classification (CTC) and Attentional sequence recognition (Attn) are the two prevailing approaches to this problem, but each may fail in some scenarios. CTC concentrates more on each individual character but is weak at modeling semantic dependencies in text. Attn-based methods model contextual semantics better but tend to overfit on limited training data. In this paper, we design a Rectified Attentional Double Supervised Network (ReADS) for general scene text recognition. To overcome the weaknesses of CTC and Attn, both are applied in our method, but with different modules in two supervised branches that complement each other. Moreover, effective spatial and channel attention mechanisms are introduced to…
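To make the two-branch idea more concrete, here is a minimal sketch in plain Python. It shows the standard CTC best-path decoding rule (collapse repeated frame labels, then drop blanks), which is why CTC attends to individual characters frame by frame, and one possible way to combine the two branch losses. The abstract does not say how ReADS weights its branches, so `combined_loss`, `ctc_weight`, and the `-` blank symbol are illustrative assumptions, not the authors' implementation.

```python
from itertools import groupby

BLANK = "-"  # hypothetical blank symbol, chosen here for readability


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Apply the CTC best-path rule to a sequence of per-frame labels."""
    # Step 1: merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(frame_labels)]
    # Step 2: remove blanks, which only separate genuine repeated characters.
    return "".join(label for label in collapsed if label != blank)


def combined_loss(ctc_loss, attn_loss, ctc_weight=0.5):
    """Weighted sum of the two branch losses (the weighting is an assumption)."""
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attn_loss


if __name__ == "__main__":
    # The blank between the two "l" runs keeps the double letter in "hello".
    print(ctc_greedy_decode("hh-e-ll-lo--"))
    print(combined_loss(2.0, 4.0))
```

In a real model the per-frame labels would come from an argmax over the CTC branch's output distribution, and the attention branch would contribute its own cross-entropy term. The sketch only illustrates how two supervision signals could be summed during end-to-end training.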
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction
