A Masked Bounding-Box Selection Based ResNet Predictor for Text Rotation Prediction
Michael Yang, Yuan Lin, and ChiuMan Ho

TL;DR
This paper introduces a novel masked bounding-box selection method that enhances a ResNet predictor's ability to accurately determine text rotation angles in images, significantly improving OCR performance on rotated texts by focusing on relevant regions and reducing background noise interference.
Contribution
The paper proposes a new masked bounding-box selection technique integrated with a ResNet predictor to improve text rotation prediction accuracy in OCR systems.
Findings
Significant performance improvement over traditional methods.
Effective reduction of background noise influence.
Enhanced robustness in rotated text recognition.
Abstract
Existing Optical Character Recognition (OCR) systems are capable of recognizing images with horizontal text. However, as the rotation of the text increases, it becomes harder to recognize, and the performance of OCR systems decreases. Predicting the rotation of the text and correcting the image are therefore important. Previous work mainly uses traditional Computer Vision methods such as the Hough Transform and Deep Learning methods such as Convolutional Neural Networks. However, all of these methods are prone to the background noise that commonly exists in general images containing text. To tackle this problem, we introduce a new masked bounding-box selection method that incorporates bounding-box information into the system. By training a ResNet predictor to focus on the bounding box as the region of interest (ROI), the predictor learns to overlook the background…
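The abstract is cut off before it says how the mask is built, so the following is only a minimal sketch of one plausible reading: keep the pixels inside the text bounding boxes and blank out everything else before the image reaches the predictor. The function name `mask_outside_boxes`, the axis-aligned `(x0, y0, x1, y1)` box format, and the zero fill value are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def mask_outside_boxes(image, boxes, fill=0):
    """Keep pixels inside any box; set all other pixels to `fill`.

    image: H x W x C array.
    boxes: iterable of (x0, y0, x1, y1) with exclusive x1/y1.
           Axis-aligned boxes are an assumption of this sketch.
    """
    # Boolean H x W map of the pixels to keep (the region of interest).
    keep = np.zeros(image.shape[:2], dtype=bool)
    for x0, y0, x1, y1 in boxes:
        keep[y0:y1, x0:x1] = True

    # Start from a fully blanked image and copy the kept pixels back in.
    masked = np.full_like(image, fill)
    masked[keep] = image[keep]
    return masked


# Toy 4 x 6 RGB image with non-zero pixel values everywhere.
image = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3) + 1
masked = mask_outside_boxes(image, [(1, 1, 4, 3)])
```

In a pipeline of this kind, the masked image (rather than the raw one) would be what a rotation predictor sees, so background clutter outside the boxes cannot influence the predicted angle.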
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction
Methods: Batch Normalization · Max Pooling · Average Pooling · 1x1 Convolution · Kaiming Initialization · Residual Block · Convolution · Residual Connection · Global Average Pooling
