Towards Unified Multi-granularity Text Detection with Interactive Attention
Xingyu Wan, Chengquan Zhang, Pengyuan Lyu, Sen Fan, Zihan Ni, Kun Yao, Errui Ding, Jingdong Wang

TL;DR
This paper presents DAT, a unified end-to-end model that detects text at multiple granularities using interactive attention, improving efficiency and accuracy across various text detection tasks.
Contribution
Introduces DAT, a unified end-to-end model that uses across-granularity interactive attention to handle scene text detection, layout analysis, and document page detection in a single network.
Findings
Achieves state-of-the-art results on multiple benchmarks.
Effectively handles multi-oriented and arbitrarily-shaped texts.
Improves detection accuracy for complex layouts.
Abstract
Existing OCR engines or document image analysis systems typically rely on training separate models for text detection in varying scenarios and granularities, leading to significant computational complexity and resource demands. In this paper, we introduce "Detect Any Text" (DAT), an advanced paradigm that seamlessly unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model. This design enables DAT to efficiently manage text instances at different granularities, including *word*, *line*, *paragraph* and *page*. A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances at varying granularities by correlating structural information across different text queries. As a result, it enables the model to achieve mutually beneficial detection…
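The abstract does not spell out how the interactive attention module is built, so the following is only a minimal sketch of the general idea of letting queries from different granularities attend to one another. Everything in it is an assumption made for illustration: the function names, the query counts, the 8-dimensional vectors, and the use of plain scaled dot-product attention with no learned projections. It should not be read as the authors' implementation.

```python
import math
import random

# Granularities named in the abstract; the grouping scheme itself is assumed.
GRANULARITIES = ("word", "line", "paragraph", "page")


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries):
    """Scaled dot-product attention weights of every query over all queries."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in queries])
        for q in queries
    ]


def interactive_attention(groups):
    """Hypothetical cross-granularity step: pool all query groups, attend, split back.

    `groups` maps a granularity name to a list of query vectors. Queries act as
    their own keys and values here (identity projections), which is a
    simplification and not something the abstract states.
    """
    names, flat = [], []
    for name in GRANULARITIES:
        for query in groups[name]:
            names.append(name)
            flat.append(query)

    weights = attention_weights(flat)
    dim = len(flat[0])
    updated = [
        [sum(w * v[d] for w, v in zip(row, flat)) for d in range(dim)]
        for row in weights
    ]

    # Regroup the updated queries by their original granularity.
    out = {name: [] for name in GRANULARITIES}
    for name, vec in zip(names, updated):
        out[name].append(vec)
    return out, weights


# Toy input: 4 word, 3 line, 2 paragraph, and 1 page query (arbitrary counts).
rng = random.Random(0)
groups = {
    name: [[rng.gauss(0, 1) for _ in range(8)] for _ in range(count)]
    for name, count in zip(GRANULARITIES, (4, 3, 2, 1))
}
updated, weights = interactive_attention(groups)
```

In this toy version each updated word query is a weighted mix of word, line, paragraph, and page queries, which is one simple way information could flow between granularities.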
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Topic Modeling
