Shift Variance in Scene Text Detection

Markus Glitzner; Jan-Hendrik Neudeck; Philipp Härtinger

arXiv:2208.09231·cs.CV·August 22, 2022

Open Access

TL;DR

This paper investigates the shift variance problem in scene text detection, demonstrating how architectural modifications and smoothing filters can improve shift consistency, and proposes a new metric to quantify this variability.

Contribution

The paper demonstrates the inherent shift variance of a state-of-the-art fully convolutional text detector, shows that small architectural adjustments improve shift equivariance, and introduces a metric to measure shift variability.

Findings

01 Small architectural changes improve shift equivariance.

02 Adding smoothing filters significantly enhances shift consistency.

03 Proposed metric effectively quantifies shift variability in text detectors.

Abstract

Theory of convolutional neural networks suggests the property of shift equivariance, i.e., that a shifted input causes an equally shifted output. In practice, however, this is not always the case. This poses a great problem for scene text detection for which a consistent spatial response is crucial, irrespective of the position of the text in the scene. Using a simple synthetic experiment, we demonstrate the inherent shift variance of a state-of-the-art fully convolutional text detector. Furthermore, using the same experimental setting, we show how small architectural changes can lead to an improved shift equivariance and less variation of the detector output. We validate the synthetic results using a real-world training schedule on the text detection network. To quantify the amount of shift variability, we propose a metric based on well-established text detection benchmarks. While…
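The gap between theoretical shift equivariance and practical behavior can be illustrated with a toy example. The sketch below is not the paper's experiment or its proposed metric; it is a minimal 1D illustration, assuming that strided downsampling is one source of shift variance and that a small low-pass filter applied before downsampling reduces it. The function names (`downsample`, `blur`, `response_per_shift`) and the [1, 2, 1] / 4 kernel are hypothetical choices made for this illustration.

```python
import numpy as np

def downsample(x, stride=2):
    """Keep every `stride`-th sample, as a strided layer does."""
    return x[::stride]

def blur(x):
    """Circular [1, 2, 1] / 4 binomial low-pass filter (illustrative choice)."""
    return 0.25 * np.roll(x, 1) + 0.5 * x + 0.25 * np.roll(x, -1)

def response_per_shift(layer, signal, shifts):
    """Total output response of `layer` for each circular shift of the input."""
    return np.array([layer(np.roll(signal, s)).sum() for s in shifts])

# A single impulse stands in for a small object in the input.
signal = np.zeros(16)
signal[4] = 1.0
shifts = range(4)

plain = response_per_shift(downsample, signal, shifts)
smoothed = response_per_shift(lambda x: downsample(blur(x)), signal, shifts)

print("plain   :", plain)     # response depends on the shift
print("smoothed:", smoothed)  # response is the same for every shift
```

In this toy setting the plain strided layer responds only when the impulse lands on an even index, so its total response alternates with the shift, while the smoothed variant gives the same total response at every shift. Real detectors are far more complex, so this only conveys the intuition behind the smoothing-filter finding.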

Taxonomy

Topics: Handwritten Text Recognition Techniques