Text Spotting Transformers
Xiang Zhang, Yongwen Su, Subarna Tripathi, Zhuowen Tu

TL;DR
TESTR is a Transformer-based end-to-end framework for text detection and recognition that handles curved and arbitrarily shaped text without region-of-interest operations or heuristics-driven post-processing, and reports state-of-the-art results on curved text datasets.
Contribution
The paper proposes a novel Transformer-based text spotting framework that effectively detects and recognizes curved text without traditional heuristics or ROI operations.
Findings
State-of-the-art performance on curved text datasets
Effective detection of arbitrarily shaped text instances
No reliance on heuristics-driven post-processing
Abstract
In this paper, we present TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework using Transformers for text detection and recognition in the wild. TESTR builds upon a single encoder and dual decoders for joint text-box control point regression and character recognition. Unlike most existing work, our method is free from Region-of-Interest operations and heuristics-driven post-processing procedures; TESTR is particularly effective when dealing with curved text-boxes, where special care is needed to adapt the traditional bounding-box representations. We show that our canonical representation of control points is suitable for text instances in both Bezier curve and polygon annotations. In addition, we design a bounding-box guided polygon detection (box-to-polygon) process. Experiments on curved and arbitrarily shaped datasets demonstrate…
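The abstract mentions control points that can describe text instances annotated either as Bezier curves or as polygons. As a rough illustration of how the two annotation styles relate, the sketch below samples points along two cubic Bezier curves (one for the top edge of a text instance, one for the bottom edge) to obtain a polygon outline. This is not taken from the TESTR code: the function names `cubic_bezier` and `bezier_to_polygon`, the choice of cubic curves with four control points per edge, the ordering of the bottom edge from right to left, and the default sample count are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cubic_bezier(ctrl: Sequence[Point], t: float) -> Point:
    """Evaluate a cubic Bezier curve with 4 control points at t in [0, 1]."""
    p0, p1, p2, p3 = ctrl
    u = 1.0 - t
    # Bernstein basis polynomials of degree 3.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def bezier_to_polygon(top: Sequence[Point],
                      bottom: Sequence[Point],
                      n: int = 8) -> List[Point]:
    """Sample n points on each edge and join them into one polygon outline.

    Assumption (hypothetical convention for this sketch): `top` runs left to
    right and `bottom` runs right to left, so concatenating the samples
    traces the outline in a single consistent direction.
    """
    if n < 2:
        raise ValueError("need at least 2 samples per edge")
    ts = [i / (n - 1) for i in range(n)]
    top_pts = [cubic_bezier(top, t) for t in ts]
    bottom_pts = [cubic_bezier(bottom, t) for t in ts]
    return top_pts + bottom_pts


# Usage: a straight, axis-aligned text box expressed as two cubic curves.
top_edge = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
bottom_edge = [(3.0, 1.0), (2.0, 1.0), (1.0, 1.0), (0.0, 1.0)]
polygon = bezier_to_polygon(top_edge, bottom_edge, n=4)
```

With evenly spaced collinear control points, as in the usage example, each curve reduces to a straight segment, so the sampled polygon is simply the rectangle's outline. Curved text would use control points that bend the two edges.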
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Vehicle License Plate Recognition
