MSLKANet: A Multi-Scale Large Kernel Attention Network for Scene Text Removal
Guangtao Lyu (School of Computer Science and Artificial Intelligence, Wuhan University of Technology, China)

TL;DR
MSLKANet is a novel multi-scale large kernel attention network designed for scene text removal, effectively capturing global information and long-range dependencies to improve background filling in full images.
Contribution
The paper introduces MSLKANet with multi-scale large kernel attention and a large kernel spatial pyramid pooling mechanism, enhancing scene text removal by capturing extensive contextual information.
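As a rough illustration of why large (and dilated) kernels capture more context, the sketch below computes the receptive field of a stack of stride-1 convolutions using the standard rule that each layer adds (kernel_size - 1) * dilation pixels. The function name and both layer configurations are hypothetical and chosen only for illustration; they are not the kernel sizes or dilations used in MSLKANet, which this summary does not state.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int]]) -> int:
    """Receptive field (one side, in pixels) of stacked stride-1 convolutions.

    Each layer is given as (kernel_size, dilation). A stride-1 layer widens
    the receptive field by (kernel_size - 1) * dilation.
    """
    rf = 1  # a single pixel sees only itself before any convolution
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical configurations for comparison (assumptions, not from the paper).
plain_3x3_stack = [(3, 1)] * 3        # three ordinary 3x3 convolutions
large_kernel_pair = [(5, 1), (7, 3)]  # 5x5 conv, then 7x7 conv with dilation 3

print(receptive_field(plain_3x3_stack))    # 7
print(receptive_field(large_kernel_pair))  # 23
```

Under these assumed settings, two large-kernel layers cover a 23-pixel span while three 3x3 layers cover only 7, which is the kind of gap that motivates large kernel attention for relating text regions to distant background.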
Findings
Achieves state-of-the-art results on synthetic datasets
Performs effectively on real-world images
Validates the effectiveness of MSLKA and LKSPP components
Abstract
Scene text removal aims to remove text from natural images and fill the affected regions with perceptually plausible background. It has attracted increasing attention due to its applications in privacy protection, scene text retrieval, and text editing. With the development of deep learning, previous methods have achieved significant improvements. However, most existing methods seem to ignore large receptive fields and global information. The pioneering method achieves significant improvements merely by changing the training data from cropped images to full images. In this paper, we present a single-stage multi-scale network, MSLKANet, for scene text removal in full images. To obtain large receptive fields and global information, we propose multi-scale large kernel attention (MSLKA) to obtain long-range dependencies between the text regions and the…
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Handwritten Text Recognition Techniques
Methods: Spatial Pyramid Pooling
