A3S: Adversarial learning of semantic representations for Scene-Text Spotting

Masato Fujitake

arXiv:2302.10641·cs.CV·February 22, 2023

TL;DR

This paper introduces A3S, an adversarial learning approach that enhances scene-text spotting by jointly predicting semantic features, leading to improved end-to-end accuracy in recognizing meaningful text in natural images.

Contribution

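The paper proposes an adversarial learning framework that predicts semantic features of the detected text area alongside text recognition. It targets a gap left by prior work, which improved text region detection while end-to-end accuracy, including recognition, remained insufficient.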

Findings

01. Achieves higher accuracy than existing methods on public datasets.

02. Effectively predicts semantic features to improve text recognition.

03. Enhances end-to-end scene-text spotting performance.

Abstract

Scene-text spotting is a task that detects text regions in natural scene images and recognizes their characters simultaneously. It has attracted much attention in recent years because of its wide range of applications. Existing research has mainly focused on improving text region detection rather than text recognition. Thus, while detection accuracy has improved, end-to-end accuracy remains insufficient. Text in natural scene images tends to be not a random string of characters but a meaningful one, a word. Therefore, we propose adversarial learning of semantic representations for scene-text spotting (A3S) to improve end-to-end accuracy, including text recognition. A3S simultaneously predicts semantic features in the detected text area instead of only performing text recognition based on existing visual features. Experimental results on publicly available datasets show that the proposed…
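The abstract does not spell out the training objective, so the following is only a rough sketch of how an adversarial term on semantic features could be combined with a recognition loss. It swaps in the standard GAN objective with a non-saturating generator loss, which may differ from what the paper uses. Every name here (`discriminator_score`, `adversarial_losses`, `total_spotting_loss`, `adv_weight`) is hypothetical, as is the assumption that a reference semantic embedding of the ground-truth word is available and that the discriminator is a simple linear model.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def discriminator_score(features, weights, bias=0.0):
    """Hypothetical linear discriminator.

    Returns the estimated probability that `features` is a reference
    semantic embedding rather than one predicted from visual features.
    """
    logit = sum(f * w for f, w in zip(features, weights)) + bias
    return sigmoid(logit)


def adversarial_losses(predicted, reference, weights, bias=0.0, eps=1e-12):
    """Standard GAN losses (an assumption, not confirmed by the source).

    predicted: semantic feature predicted from the detected text area
    reference: semantic embedding of the ground-truth word (assumed given)
    """
    d_real = discriminator_score(reference, weights, bias)
    d_fake = discriminator_score(predicted, weights, bias)
    # The discriminator tries to score references high and predictions low.
    disc_loss = -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))
    # The predictor tries to make its features look like references.
    gen_loss = -math.log(d_fake + eps)
    return disc_loss, gen_loss


def total_spotting_loss(recognition_loss, gen_loss, adv_weight=0.1):
    """Recognition loss plus a weighted adversarial term (weight is illustrative)."""
    return recognition_loss + adv_weight * gen_loss


if __name__ == "__main__":
    reference = [1.0, 0.0]
    weights = [2.0, 0.0]
    _, far = adversarial_losses([-1.0, 0.0], reference, weights)
    _, near = adversarial_losses([0.9, 0.0], reference, weights)
    print(f"generator loss, far from reference:  {far:.3f}")
    print(f"generator loss, near the reference: {near:.3f}")
```

In this toy setup, a predicted feature that lies close to the reference embedding receives a lower generator loss than one that lies far from it, which is the pressure an adversarial term is meant to apply to the semantic prediction.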

Taxonomy

Topics: Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications · Face recognition and analysis