Domain Adaptive Scene Text Detection via Subcategorization
Zichen Tian, Chuhui Xue, Jingyi Zhang, Shijian Lu

TL;DR
This paper introduces SCAST, a subcategory-aware self-training method for domain adaptive scene text detection that improves transferability and reduces overfitting, achieving superior results across benchmarks.
Contribution
The paper proposes a novel subcategory-aware self-training approach, SCAST, for domain adaptive scene text detection, addressing overfitting and noisy labels in unlabelled target domains.
Findings
SCAST outperforms existing methods on multiple benchmarks.
It generalizes well to other domain adaptive detection tasks.
The approach effectively mitigates overfitting and label noise.
Abstract
Most existing scene text detectors require large-scale training data which cannot scale well due to two major factors: 1) scene text images often have domain-specific distributions; 2) collecting large-scale annotated scene text images is laborious. We study domain adaptive scene text detection, a largely neglected yet very meaningful task that aims for optimal transfer of labelled scene text images while handling unlabelled images in various new domains. Specifically, we design SCAST, a subcategory-aware self-training technique that mitigates the network overfitting and noisy pseudo labels in domain adaptive scene text detection effectively. SCAST consists of two novel designs. For labelled source data, it introduces pseudo subcategories for both foreground texts and background stuff which helps train more generalizable source models with multi-class detection objectives. For…
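The source-side idea in the abstract, replacing the binary text/background labels with several pseudo subcategories so the detector trains on a multi-class objective, can be illustrated with a small sketch. The abstract does not say how the subcategories are obtained; the k-means clustering of per-region feature vectors used here is an assumption made for illustration, and the names `kmeans`, `pseudo_subcategories`, `k_fg`, and `k_bg` are hypothetical rather than taken from the paper.

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of `features`."""
    rng = np.random.default_rng(seed)
    # Fancy indexing copies, so updating the centres does not touch `features`.
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every sample to every centre, shape (n, k).
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centers[j] = members.mean(axis=0)
    return assign

def pseudo_subcategories(features, is_text, k_fg=2, k_bg=2):
    """Map binary text/background labels to k_bg + k_fg pseudo classes.

    Assumption (not from the abstract): subcategories come from clustering.
    Background regions get ids 0..k_bg-1, text regions get k_bg..k_bg+k_fg-1.
    """
    labels = np.empty(len(features), dtype=np.int64)
    bg = ~is_text
    labels[bg] = kmeans(features[bg], k_bg)
    labels[is_text] = k_bg + kmeans(features[is_text], k_fg)
    return labels

# Toy usage: random vectors stand in for per-region features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 8))
is_text = np.arange(200) % 2 == 0
labels = pseudo_subcategories(feats, is_text)
```

The original binary label stays recoverable from the pseudo class (`labels >= k_bg` marks text), so a detector trained on the finer classes can still be evaluated as a text/background detector.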
Taxonomy
Topics: Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
