TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition
Tianlun Zheng, Zhineng Chen, Jinfeng Bai, Hongtao Xie, Yu-Gang Jiang

TL;DR
TPS++ is an attention-enhanced thin-plate spline (TPS) rectifier for scene text recognition. It makes rectification content-aware and reaches state-of-the-art recognition results at minimal additional computational cost.
Contribution
The paper proposes TPS++, a novel attention-based TPS transformation that jointly models content and control points for improved text rectification in scene text recognition.
Findings
Achieves state-of-the-art recognition accuracy on benchmarks.
Generalizes well across different backbones and recognizers.
Increases rectification quality with minimal overhead.
Abstract
Text irregularities pose significant challenges to scene text recognizers. Thin-Plate Spline (TPS)-based rectification is widely regarded as an effective means of dealing with them. Currently, the calculation of TPS transformation parameters depends purely on the quality of the regressed text borders. It ignores the text content and often leads to unsatisfactory rectification results for severely distorted text. In this work, we introduce TPS++, an attention-enhanced TPS transformation that incorporates the attention mechanism into text rectification for the first time. TPS++ formulates the parameter calculation as a joint process of foreground control point regression and content-based attention score estimation, which is computed by a specially designed gated-attention block. TPS++ builds a more flexible content-aware rectifier, generating a natural text correction that is easier to read by the…
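For context, the conventional TPS fit that the abstract contrasts with can be sketched as below. This is the textbook thin-plate spline solve from matched control points, not the TPS++ method: it has no attention scores and no gated-attention block. The function names (`tps_kernel`, `solve_linear`, `fit_tps`, `apply_tps`) and the kernel convention U(r) = r² log r² are assumptions chosen for illustration, and the example uses only the Python standard library.

```python
import math


def tps_kernel(p, q):
    """Radial basis U(r) = r^2 * log(r^2), defined as 0 when the points coincide."""
    r2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return 0.0 if r2 == 0.0 else r2 * math.log(r2)


def solve_linear(a, b):
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan elimination."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: control points may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [[v / m[i][i] for v in m[i][n:]] for i in range(n)]


def fit_tps(src, dst):
    """Fit TPS parameters that map each control point src[i] onto dst[i].

    Returns k + 3 rows of (x, y) coefficients: k kernel weights followed by
    the affine part (offset, x coefficient, y coefficient).
    """
    k = len(src)
    size = k + 3
    a = [[0.0] * size for _ in range(size)]
    b = [[0.0, 0.0] for _ in range(size)]
    for i, (x, y) in enumerate(src):
        for j in range(k):
            a[i][j] = tps_kernel(src[i], src[j])
        a[i][k:] = [1.0, x, y]  # affine columns
        # Side conditions: kernel weights sum to zero and have zero moments.
        a[k][i], a[k + 1][i], a[k + 2][i] = 1.0, x, y
        b[i] = [dst[i][0], dst[i][1]]
    return solve_linear(a, b)


def apply_tps(params, src, point):
    """Evaluate the fitted spline at an arbitrary 2D point."""
    k = len(src)
    x, y = point
    out = []
    for d in range(2):
        val = params[k][d] + params[k + 1][d] * x + params[k + 2][d] * y
        val += sum(params[i][d] * tps_kernel(point, src[i]) for i in range(k))
        out.append(val)
    return tuple(out)


# Hypothetical control points: unit-square corners plus a displaced center.
src_points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
dst_points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.7)]
tps_params = fit_tps(src_points, dst_points)
```

In this sketch the spline is fully determined by the control-point pairs, which illustrates the abstract's point that a border-only fit inherits every error in the regressed points. In a rectifier the fitted mapping would typically be evaluated over a sampling grid rather than at single points.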
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Neural Network Applications · Multimodal Machine Learning Applications
Methods: Content-based Attention
