Sequence to sequence learning for unconstrained scene text recognition
Ahmed Mamdouh A. Hassanien

TL;DR
This paper introduces a cascaded CNN-LSTM model for unconstrained scene text recognition. Modeling the relationships between characters improves recognition accuracy, and the model generalizes well to unseen data, with applications in traffic monitoring.
Contribution
The paper presents a novel CNN-LSTM cascade approach that enhances scene text recognition accuracy and generalization to unseen words in unconstrained environments.
Findings
Achieved state-of-the-art accuracy on the ICDAR 13 dataset.
LSTM encoding reduces false positives and negatives.
Model generalizes well to unseen data.
Abstract
In this work we present a state-of-the-art approach for unconstrained natural scene text recognition. We propose a cascade approach that incorporates a convolutional neural network (CNN) architecture followed by a long short-term memory (LSTM) model. The CNN learns visual features for the characters and uses them with a softmax layer to detect sequences of characters. While the CNN gives very good recognition results, it does not model the relations between characters and hence gives rise to false positive and false negative cases (confusing characters due to visual similarities like "g" and "9", or confusing background patches with characters; either removing existing characters or adding non-existing ones). To alleviate these problems we leverage recent developments in LSTM architectures to encode contextual information. We show that the LSTM can dramatically reduce such errors and achieve…
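The cascade described in the abstract can be illustrated with a minimal sketch: per-position CNN scores are turned into character probabilities by a softmax, and a standard LSTM cell then reads those probabilities in order so that each hidden state carries context from the preceding positions. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`softmax`, `lstm_step`, `init_params`, `encode_sequence`), the choice to feed softmax probabilities directly into the LSTM, the layer sizes, and the random untrained weights. It shows the data flow only, not the authors' architecture or training procedure.

```python
import math
import random


def softmax(scores):
    """Convert raw per-character scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical, untrained weights for the four LSTM gates (i, f, o, g)."""
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
                for _ in range(hidden_size)]

    return {name: (mat(), [0.0] * hidden_size) for name in ("i", "f", "o", "g")}


def lstm_step(x, h, c, params):
    """One standard LSTM cell update on input x with previous state (h, c)."""
    z = x + h  # concatenate the input and the previous hidden state
    gates = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre = [a + bias for a, bias in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    # New cell state: forget part of the old state, add the gated candidate.
    c_new = [f * cp + i * g
             for f, cp, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    # New hidden state: output gate applied to the squashed cell state.
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new


def encode_sequence(cnn_scores, params, hidden_size):
    """Run the LSTM over per-position CNN scores; return one hidden state per position."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    states = []
    for scores in cnn_scores:
        probs = softmax(scores)  # assumed CNN output: class probabilities
        h, c = lstm_step(probs, h, c, params)
        states.append(h)
    return states


if __name__ == "__main__":
    # Toy scores over a hypothetical 3-symbol alphabet ("g", "9", background).
    # The first position is nearly tied between "g" and "9".
    toy_scores = [[2.0, 1.9, 0.1], [0.2, 3.0, 0.1]]
    params = init_params(input_size=3, hidden_size=4)
    for t, state in enumerate(encode_sequence(toy_scores, params, hidden_size=4)):
        print(t, [round(v, 3) for v in state])
```

In a trained system, a further output layer on top of each hidden state would produce the context-aware character prediction; with the random weights used here the hidden states carry no meaning beyond showing how information from earlier positions reaches later ones.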
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
Methods: Sigmoid Activation · Tanh Activation · Softmax · Long Short-Term Memory
