Scene Text Image Super-Resolution via Content Perceptual Loss and Criss-Cross Transformer Blocks

Rui Qin; Bin Wang; Yu-Wing Tai

arXiv:2210.06924·cs.CV·October 14, 2022

Open Access

TL;DR

This paper introduces TATSR, a novel text super-resolution framework utilizing Criss-Cross Transformer Blocks and Content Perceptual Loss to improve text readability and recognition across multiple languages.

Contribution

The paper proposes a new framework with orthogonal transformer-based content extraction and a content-aware loss, enhancing text super-resolution performance and generalizability.
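
The idea of attending along two orthogonal axes can be illustrated with a small, dependency-free sketch. The summary above does not describe the internals of the Criss-Cross Transformer Block, so everything here is an assumption made for illustration: the function names (`self_attention`, `orthogonal_attention`) are hypothetical, the query/key/value projections are identity maps rather than learned weights, and the row and column outputs are simply summed. It should not be read as the authors' block.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention over a list of equal-length vectors.

    Assumption: identity Q/K/V projections (no learned weights), single head.
    """
    dim = len(seq[0])
    out = []
    for query in seq:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in seq
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * value[c] for w, value in zip(weights, seq)) for c in range(dim)]
        )
    return out


def orthogonal_attention(feat):
    """Apply attention along rows and along columns of an H x W x C feature map.

    Assumption: the two directional outputs are combined by element-wise sum.
    """
    height, width = len(feat), len(feat[0])
    # Horizontal pass: each row is treated as a sequence of W vectors.
    rows = [self_attention(row) for row in feat]
    # Vertical pass: each column is treated as a sequence of H vectors,
    # so cols[x][y] is the output for position (y, x).
    cols = [
        self_attention([feat[y][x] for y in range(height)]) for x in range(width)
    ]
    return [
        [[r + c for r, c in zip(rows[y][x], cols[x][y])] for x in range(width)]
        for y in range(height)
    ]
```

Calling `orthogonal_attention` on a nested list of shape H x W x C returns a nested list of the same shape. For text images, the horizontal pass mixes information across characters in a line, while the vertical pass mixes information within a character's strokes.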

Findings

01. Outperforms state-of-the-art methods in recognition accuracy
02. Improves human perception of reconstructed text images
03. Effective across multiple languages

Abstract

Text image super-resolution is a unique and important task that enhances the readability of text images for humans. It is widely used as a pre-processing step in scene text recognition. However, due to the complex degradation in natural scenes, recovering high-resolution text from low-resolution inputs is ambiguous and challenging. Existing methods mainly leverage deep neural networks trained with pixel-wise losses designed for natural image reconstruction, which ignore the unique characteristics of text. A few works have proposed content-based losses. However, they focus only on the accuracy of text recognizers, while the reconstructed images may still be ambiguous to humans. Further, they often generalize poorly across languages. To this end, we present TATSR, a Text-Aware Text Super-Resolution framework, which effectively learns the unique text characteristics using…

Taxonomy

Topics: Advanced Image Processing Techniques

Methods: Multi-Head Attention · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Layer Normalization · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Adam · Dense Connections