Visual Semantic Re-ranker for Text Spotting

Ahmed Sabir; Francesc Moreno-Noguer; Lluís Padró

arXiv:1810.09776·cs.CV·October 30, 2018

TL;DR

This paper introduces a semantic re-ranking method for text spotting that leverages visual context to improve recognition accuracy with minimal additional computation.

Contribution

It presents a novel post-processing re-ranking approach that uses semantic relations between text and scene context to enhance existing text recognition systems.

Findings

01. Improves text spotting accuracy on the ICDAR'17 dataset

02. Works as a drop-in enhancement for existing systems

03. Achieves the performance boost at low computational cost

Abstract

Many current state-of-the-art methods for text recognition are based on purely local information and ignore the semantic correlation between text and its surrounding visual context. In this paper, we propose a post-processing approach to improve the accuracy of text spotting by using the semantic relation between the text and the scene. We initially rely on an off-the-shelf deep neural network that provides a series of text hypotheses for each input image. These text hypotheses are then re-ranked using their semantic relatedness with the object in the image. As a result of this combination, the performance of the original network is boosted at a very low computational cost. The proposed framework can be used as a drop-in complement for any text-spotting algorithm that outputs a ranking of word hypotheses. We validate our approach on the ICDAR'17 shared task dataset.
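
The re-ranking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `rerank`), the toy three-dimensional embeddings, the weight `alpha`, and the weighted-sum combination of recognizer score and relatedness are all assumptions made for illustration. The sketch only shows the general idea of re-scoring a recognizer's word hypotheses by their similarity to an object label detected in the same image.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rerank(hypotheses, visual_context, embeddings, alpha=0.5):
    """Re-rank (word, recognizer_score) pairs using relatedness to the scene.

    ASSUMPTION: a weighted sum of the two scores; the paper's actual
    combination rule may differ.
    """
    ctx_vec = embeddings.get(visual_context)
    rescored = []
    for word, score in hypotheses:
        vec = embeddings.get(word)
        # Out-of-vocabulary words get zero relatedness.
        if vec is None or ctx_vec is None:
            relatedness = 0.0
        else:
            relatedness = max(cosine(vec, ctx_vec), 0.0)
        combined = (1 - alpha) * score + alpha * relatedness
        rescored.append((word, combined))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings; a real system would load pretrained vectors.
TOY_EMBEDDINGS = {
    "train": [0.9, 0.1, 0.0],
    "station": [0.8, 0.2, 0.1],
    "stallion": [0.1, 0.9, 0.2],
}

# Hypothetical recognizer output: the visually plausible but wrong word ranks first.
hypotheses = [("stallion", 0.40), ("station", 0.35), ("statlon", 0.25)]
ranked = rerank(hypotheses, "train", TOY_EMBEDDINGS)
print([word for word, _ in ranked])
```

With the detected object "train", the hypothesis "station" moves ahead of "stallion" because it is closer to the scene label in the toy embedding space, while the out-of-vocabulary string "statlon" falls to the bottom. Because the step only needs the ranked hypothesis list and one context label, it can sit behind a recognizer without retraining it.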

Code & Models

Repositories

ahmedssabir/Visual-Semantic-Relatedness-with-Word-Embedding
PyTorch · Official

Taxonomy

Topics: Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications · Video Analysis and Summarization