Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis
Tianci Bi, Xiaoyi Zhang, Zhizheng Zhang, Wenxuan Xie, Cuiling Lan, Yan Lu, Nanning Zheng

TL;DR
This paper introduces Text Grouping Adapter (TGA), a module that enables pre-trained text detectors to perform layout analysis by grouping text instances into paragraphs, improving efficiency and performance.
Contribution
The paper proposes TGA, a universal adapter that leverages pre-trained text detectors for layout analysis, allowing efficient fine-tuning and better utilization of existing detection datasets.
Findings
TGA improves layout analysis performance with frozen pre-trained models.
Incorporating TGA into various detectors enhances text grouping accuracy.
Fine-tuning TGA further boosts layout analysis results.
Abstract
Significant progress has been made in scene text detection models since the rise of deep learning, but scene text layout analysis, which aims to group detected text instances into paragraphs, has not kept pace. Previous works either handle text detection and grouping with separate models, or train a unified model from scratch. None of them makes full use of already well-trained text detectors and easily obtainable detection datasets. In this paper, we present Text Grouping Adapter (TGA), a module that enables various pre-trained text detectors to learn layout analysis, allowing us to adopt a well-trained text detector off the shelf or fine-tune it efficiently. Designed to be compatible with various text detector architectures, TGA takes detected text regions and image features as universal inputs to assemble text instance…
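To make the grouping task concrete, here is a minimal sketch of one generic way to turn pairwise scores between detected text instances into paragraph groups. It is not the TGA method: the abstract as shown here does not say how groups are decoded. The affinity matrix, the 0.5 threshold, and the function name `group_by_affinity` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def group_by_affinity(
    affinity: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[List[int]]:
    """Group instance indices into connected components.

    Two instances are linked when their affinity exceeds `threshold`.
    The affinity matrix is a hypothetical input (assumed symmetric);
    it stands in for whatever a grouping model would predict.
    """
    n = len(affinity)
    parent = list(range(n))  # union-find forest, one node per instance

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair whose affinity is above the threshold.
    for i in range(n):
        for j in range(i + 1, n):
            if affinity[i][j] > threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Collect the members of each component, ordered by first index.
    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=lambda g: g[0])


# Four detected text lines: the first two and the last two score highly.
example_affinity = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
paragraphs = group_by_affinity(example_affinity)
```

With this toy matrix, instances 0 and 1 fall into one paragraph and instances 2 and 3 into another. Connected components are the simplest decoding choice; a real system may use a different rule.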
Taxonomy
Topics: Text and Document Classification Technologies · Handwritten Text Recognition Techniques · Web Data Mining and Analysis
Methods: Adapter
