Scene Text Image Super-resolution based on Text-conditional Diffusion Models

Chihiro Noguchi; Shun Fukuda; Masao Yamanaka

arXiv:2311.09759·cs.CV·December 25, 2023·2 cites

TL;DR

This paper introduces a scene text image super-resolution (STISR) framework built on text-conditional diffusion models. The models are used both to super-resolve low-resolution text images and to synthesize LR-HR training pairs, with the goal of improving downstream scene text recognition.

Contribution

The paper applies text-conditional diffusion models to both super-resolution and LR-HR dataset synthesis. The authors report that these models surpass existing STISR methods and that training on the synthesized pairs improves STISR performance.

Findings

01. Text-conditional DMs outperform existing STISR methods.
02. Synthesized LR-HR image pairs improve STISR training.
03. The proposed framework enhances scene text recognition accuracy.

Abstract

Scene Text Image Super-resolution (STISR) has recently achieved great success as a preprocessing method for scene text recognition. STISR aims to transform blurred and noisy low-resolution (LR) text images in real-world settings into clear high-resolution (HR) text images suitable for scene text recognition. In this study, we leverage text-conditional diffusion models (DMs), known for their impressive text-to-image synthesis capabilities, for STISR tasks. Our experimental results revealed that text-conditional DMs notably surpass existing STISR methods. In particular, when the texts in the LR text images are given as input, the text-conditional DMs produce superior-quality super-resolution text images. Utilizing this capability, we propose a novel framework for synthesizing LR-HR paired text image datasets. This framework consists of three specialized text-conditional DMs, each…
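
To make the LR-HR pairing mentioned in the abstract concrete, here is a minimal sketch of what one paired training sample might look like. It is not the authors' method: the abstract describes synthesis with text-conditional DMs, whereas this stand-in uses a plain hand-crafted degradation (average pooling plus Gaussian noise). The function name `degrade`, the image sizes, the noise level, and the sample layout are all assumptions made for illustration.

```python
import numpy as np

def degrade(hr, scale=2, noise_std=0.05, seed=0):
    """Make a toy LR image from an HR grayscale image with values in [0, 1].

    Hand-crafted stand-in (box blur + downsample + Gaussian noise);
    NOT the diffusion-based synthesis described in the abstract.
    """
    h, w = hr.shape
    h, w = h - h % scale, w - w % scale          # crop so both sides divide evenly
    blocks = hr[:h, :w].reshape(h // scale, scale, w // scale, scale)
    lr = blocks.mean(axis=(1, 3))                # average pooling: blur and downsample
    rng = np.random.default_rng(seed)
    lr = lr + rng.normal(0.0, noise_std, size=lr.shape)  # additive sensor-like noise
    return np.clip(lr, 0.0, 1.0)

# Hypothetical sample: a white bar standing in for rendered text.
hr = np.zeros((32, 128))
hr[8:24, 16:112] = 1.0

# A text-conditional model would receive the LR image together with its text.
sample = {"lr": degrade(hr), "hr": hr, "text": "hello"}
```

A super-resolution model trained on such samples would map `sample["lr"]` (conditioned on `sample["text"]`) toward `sample["hr"]`; the point of the paper's framework, as the abstract describes it, is to produce more realistic pairs than a fixed degradation like this one.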

Code & Models

Repositories

toyotainfotech/stisr-tcdm
PyTorch · Official

Videos

Scene Text Image Super-Resolution Based on Text-Conditional Diffusion Models · YouTube

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods

Methods: Diffusion