A Novel Integrated Framework for Learning both Text Detection and Recognition
Wanchen Sui, Qing Zhang, Jun Yang, Wei Chu

TL;DR
This paper introduces an integrated end-to-end framework that combines text detection and recognition into a single trainable model, improving accuracy and speed while reducing computational load.
Contribution
The novel approach merges detection and recognition models with shared parameters and employs a convolutional sequence learning method, enhancing efficiency and accuracy.
Findings
Achieves high accuracy on multiple datasets.
Significantly faster inference due to the convolutional recognition network.
Reduces computational load during inference.
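One reason a convolutional recognition network can be faster than a recurrent one is that each output position depends only on a local window of the input, so all positions can be computed independently. The sketch below illustrates this with a toy stack of 1D convolutions over a sequence of scalars. The kernel size, depth, weights, and the function names `conv1d_same`, `conv_stack`, and `receptive_field` are assumptions made for illustration and are not taken from the paper.

```python
# Toy stack of 1D convolutions over a sequence of scalars.
# Kernel size, depth, and weights are hypothetical, chosen only for illustration.

def conv1d_same(seq, kernel):
    """Stride-1 convolution with zero padding; output length equals input length."""
    k = len(kernel)
    assert k % 2 == 1, "an odd kernel keeps outputs aligned with inputs"
    pad = k // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    # Each output position reads only a local window, so positions do not
    # depend on each other (unlike the step-by-step state of a recurrent net).
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(seq))]


def conv_stack(seq, kernels):
    """Apply the kernels in succession, with a ReLU after each layer."""
    out = list(seq)
    for kernel in kernels:
        out = [max(0.0, v) for v in conv1d_same(out, kernel)]
    return out


def receptive_field(kernels):
    """Number of input positions one output can see (stride 1, no dilation)."""
    return 1 + sum(len(k) - 1 for k in kernels)


kernels = [[0.25, 0.5, 0.25]] * 3   # three layers, kernel size 3
features = [1.0] * 20               # stand-in for a feature sequence
outputs = conv_stack(features, kernels)
print(len(outputs), receptive_field(kernels))  # 20 7
```

Stacking layers widens the context each output sees (here 7 input positions after three layers) without introducing any sequential dependency between positions.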
Abstract
In this paper, we propose a novel integrated framework for learning both text detection and recognition. Most existing methods treat detection and recognition as two isolated tasks and train them separately, since the two models have different parameters and each optimizes its own loss function during its own training process. In contrast to those methods, by sharing model parameters, we merge the detection model and the recognition model into a single end-to-end trainable model and train the joint model on both tasks simultaneously. The shared parameters not only reduce the computational load during inference, but also improve end-to-end text detection-recognition accuracy. In addition, we design a simpler and faster sequence learning method for the recognition network based on a succession of stacked…
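The idea of training two tasks simultaneously through shared parameters can be sketched with a scalar toy model: one shared weight feeds a detection head and a recognition head, and a single joint loss sends gradient from both tasks into the shared weight. Everything here is an assumption for illustration, including the squared-error losses, the weighting factor `lam`, and the names `joint_loss_and_grads` and `train`; the summary does not state the paper's actual loss functions or architecture.

```python
# Hypothetical toy model: one shared parameter feeds two task heads.
# The squared-error losses and all names are illustrative assumptions,
# not the paper's architecture or loss functions.

def joint_loss_and_grads(x, y_det, y_rec, params, lam=1.0):
    """Return the joint loss and its gradient for each parameter."""
    feat = params["shared"] * x                 # shared feature, computed once
    det_err = params["det"] * feat - y_det      # detection head error
    rec_err = params["rec"] * feat - y_rec      # recognition head error
    loss = det_err ** 2 + lam * rec_err ** 2    # single joint objective
    grads = {
        # Each head only sees its own task's error.
        "det": 2 * det_err * feat,
        "rec": 2 * lam * rec_err * feat,
        # The shared parameter receives gradient from both tasks.
        "shared": (2 * det_err * params["det"] * x
                   + 2 * lam * rec_err * params["rec"] * x),
    }
    return loss, grads


def train(steps=200, lr=0.05):
    """Plain gradient descent on the joint loss for one toy example."""
    params = {"shared": 0.5, "det": 0.5, "rec": 0.5}
    x, y_det, y_rec = 1.0, 2.0, -1.0
    for _ in range(steps):
        _, grads = joint_loss_and_grads(x, y_det, y_rec, params)
        for name in params:
            params[name] -= lr * grads[name]
    final_loss, _ = joint_loss_and_grads(x, y_det, y_rec, params)
    return params, final_loss


params, final_loss = train()
print(params, final_loss)
```

Because the shared feature is computed once and reused by both heads, a joint model of this shape avoids duplicating that computation at inference time, which is the kind of saving the abstract attributes to parameter sharing.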
Taxonomy
Topics: Handwritten Text Recognition Techniques · Text and Document Classification Technologies · Image Processing and 3D Reconstruction
