CT-Net: Arbitrary-Shaped Text Detection via Contour Transformer

Zhiwen Shao, Yuchen Su, Yong Zhou, Fanrong Meng, Hancheng Zhu, Bing Liu, and Rui Yao

arXiv:2307.13310·cs.CV·July 26, 2023

TL;DR

This paper introduces CT-Net, a contour-transformer-based framework for arbitrary-shaped scene text detection. It generates coarse text contours without post-processing, refines them iteratively, and uses an adaptive training strategy to improve accuracy and efficiency.

Contribution

The paper proposes a new contour-based text detection framework with iterative refinement and adaptive training, addressing limitations of previous methods in contour initialization and local information aggregation.

Findings

01. Achieves state-of-the-art accuracy on challenging datasets.

02. Operates at real-time speeds (around 10-11 FPS).

03. Demonstrates superior contour refinement capabilities.

Abstract

Contour-based scene text detection methods have developed rapidly in recent years, but still suffer from inaccurate front-end contour initialization, multi-stage error accumulation, or deficient local information aggregation. To tackle these limitations, we propose a novel arbitrary-shaped scene text detection framework named CT-Net, which performs progressive contour regression with contour transformers. Specifically, we first employ a contour initialization module that generates coarse text contours without any post-processing. Then, we adopt contour refinement modules to adaptively refine text contours in an iterative manner, which helps capture context information and deform the contour globally and progressively. Besides, we propose an adaptive training strategy to enable the contour transformers to learn more potential deformation paths, and introduce a re-score mechanism that can effectively…
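The coarse-to-fine idea in the abstract (an initial contour followed by iterative refinement) can be illustrated with a toy sketch. This is not the authors' implementation: the circular initialization, the `toy_offset_predictor` function (a stand-in for a learned contour transformer that simply moves each point halfway toward a known target), the point count, and the iteration count are all assumptions chosen for illustration.

```python
import math


def init_contour(center, radius, num_points=20):
    """Coarse contour as points evenly spaced on a circle (an assumed initialization)."""
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * k / num_points),
         cy + radius * math.sin(2 * math.pi * k / num_points))
        for k in range(num_points)
    ]


def toy_offset_predictor(contour, target, step=0.5):
    """Hypothetical stand-in for a learned refinement module.

    A real model would predict offsets from image features; here each point
    is moved a fixed fraction of the way toward its matching target point.
    """
    return [((tx - x) * step, (ty - y) * step)
            for (x, y), (tx, ty) in zip(contour, target)]


def refine(contour, target, num_iters=3):
    """Apply predicted per-point offsets repeatedly, keeping every intermediate contour."""
    history = [contour]
    for _ in range(num_iters):
        offsets = toy_offset_predictor(contour, target)
        contour = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(contour, offsets)]
        history.append(contour)
    return history


def mean_error(contour, target):
    """Mean point-to-point Euclidean distance between two contours."""
    return sum(math.hypot(x - tx, y - ty)
               for (x, y), (tx, ty) in zip(contour, target)) / len(contour)


coarse = init_contour((50.0, 50.0), radius=10.0)
target = init_contour((52.0, 49.0), radius=14.0)  # hypothetical ground-truth contour
history = refine(coarse, target, num_iters=3)
errors = [mean_error(c, target) for c in history]
```

In this toy setting each refinement step halves the mean error, so `errors` decreases monotonically. A learned module would have no access to `target` at inference time and would have to infer the offsets from image context.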
