I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection
Bo Du, Jian Ye, Jing Zhang, Juhua Liu, and Dacheng Tao

TL;DR
The paper introduces I3CL, an end-to-end framework for arbitrary-shaped scene text detection that combines intra- and inter-instance collaborative learning with semi-supervised training on pseudo labels, and reports state-of-the-art results on three benchmarks.
Contribution
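It proposes a collaborative learning framework that combines a convolutional module with multiple receptive fields, an instance-based transformer module that models dependencies between text instances, a global context module, and semi-supervised learning to improve scene text detection.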
Findings
Achieves state-of-the-art F-measure scores on three benchmarks.
Ranks first on the ICDAR2019-ArT leaderboard with a ResNeSt-101 backbone.
Semi-supervised learning with pseudo labels improves detection performance.
Abstract
Existing methods for arbitrary-shaped text detection in natural scenes face two critical issues, i.e., 1) fracture detections at the gaps in a text instance; and 2) inaccurate detections of arbitrary-shaped text instances with diverse background context. To address these issues, we propose a novel method named Intra- and Inter-Instance Collaborative Learning (I3CL). Specifically, to address the first issue, we design an effective convolutional module with multiple receptive fields, which is able to collaboratively learn better character and gap feature representations at local and long ranges inside a text instance. To address the second issue, we devise an instance-based transformer module to exploit the dependencies between different text instances and a global context module to exploit the semantic context from the shared background, which are able to collaboratively learn more…
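The intuition behind "multiple receptive fields" in the abstract can be illustrated with a toy example. The sketch below is not the authors' module: it assumes a 1D row of stroke responses, plain averaging kernels, and fusion by a simple mean, and the names `box_filter_1d` and `multi_receptive_field` are hypothetical. It only shows why a wide branch can respond inside a gap between characters where narrow branches see nothing.

```python
def box_filter_1d(signal, width):
    """Average each position over a centered window of odd `width`.

    Positions outside the signal are treated as zero (zero padding).
    """
    half = width // 2
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        for j in range(i - half, i + half + 1):
            if 0 <= j < n:
                total += signal[j]
        out.append(total / width)
    return out


def multi_receptive_field(signal, widths=(1, 3, 7)):
    """Run parallel branches with different window widths and average them.

    Assumption: mean fusion is chosen for simplicity; a learned module
    would use trainable kernels and a learned fusion instead.
    """
    branches = [box_filter_1d(signal, w) for w in widths]
    return [sum(vals) / len(widths) for vals in zip(*branches)]


# Toy "text line": two character strokes (1.0) separated by a gap (0.0).
row = [1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0]
narrow = box_filter_1d(row, 3)
fused = multi_receptive_field(row)

print("narrow branch at gap centre:", narrow[3])
print("fused response at gap centre:", fused[3])
```

At the centre of the gap the width-3 branch sees only background, while the width-7 branch still covers the strokes on both sides, so the fused response there is non-zero. This is the kind of local-plus-long-range evidence that would help a detector avoid splitting one text instance at its gaps.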
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Multimodal Machine Learning Applications
